SPECTRAL ESTIMATION OF THE FRACTIONAL ORDER
OF A LEVY PROCESS 

By Denis Belomestny 1 

Weierstrass-Institute Berlin 

We consider the problem of estimating the fractional order of a 
Levy process from low frequency historical and options data. An esti- 
mation methodology is developed which allows us to treat both esti- 
mation and calibration problems in a unified way. The corresponding 
procedure consists of two steps: the estimation of a conditional char- 
acteristic function and the weighted least squares estimation of the 
fractional order in spectral domain. While the second step is iden- 
tical for both calibration and estimation, the first one depends on 
the problem at hand. Minimax rates of convergence for the fractional 
order estimate are derived, the asymptotic normality is proved and 
a data-driven algorithm based on aggregation is proposed. The per- 
formance of the estimator in both estimation and calibration setups 
is illustrated by a simulation study. 

1. Introduction. Nowadays Levy processes are undoubtedly one of the
most popular tools for modeling economic and financial time series [see, e.g.,
Cont and Tankov (2004), for an overview]. This is not surprising if one 
takes into account their simplicity and analytic tractability on the one hand 
and the ability to reproduce many stylized facts of financial time series 
on the other hand. In the last decade, new subclasses of Levy processes 
have been introduced and actively studied (mainly in the context of op- 
tion pricing). Among the best known models are normal inverse Gaussian 
processes (NIG), hyperbolic processes (HP), generalized hyperbolic pro- 
cesses (GHP) and truncated (or tempered) Levy processes (TLP). 
Boyarchenko and Levendorskii (2002) have introduced a general class of reg- 
ular Levy processes of exponential type (RLE) which contains all above 
mentioned particular Levy models. This type of processes is characterized 
by the requirement that the modulus of the characteristic function of increments
behaves like $\exp(-\eta|u|^{\alpha})$ as $|u| \to \infty$ for some $0 < \alpha < 2$. The parameter $\alpha$
coincides with the fractional order of the underlying Levy process and plays 
an important role because it determines the decay of the characteristic func- 
tion and hence the smoothness properties of the corresponding state price 
density. Statistical inference for RLE processes is the subject of our paper. 

There are basically two types of statistical problems relevant for Levy 
processes: the estimation of parameters of a Levy process $X_t$ from a time
series of the asset $S_t = \exp(X_t)$ and the calibration of these parameters using
options data. Both problems have received much attention recently. 

Suppose that a Levy process $X_t$ is observed at $n$ time points $\Delta, 2\Delta, \ldots, n\Delta$.
Since $X_0 = 0$, this amounts to observing $n$ increments $X_i = X_{i\Delta} - X_{(i-1)\Delta}$,
$i = 1, \ldots, n$. If $\Delta$ is small (high-frequency data), then a large increment $X_i$
indicates that a jump occurred between times $t_{i-1}$ and $t_i$. Based on this insight
and the continuous-time observation analogue, inference for the Levy
measure of the underlying Levy process can be conducted. See, for example,
Aït-Sahalia and Jacod (2006) for a semiparametric problem of estimating
the volatility of a stable process in the presence of Levy perturbation
or Lee and Mykland (2008) and Figueroa-Lopez and Houdre (2006) for the 
nonparametric problem of testing and estimation for jump diffusion models. 
For low-frequency observations, however, we cannot be sure to what extent 
the increment Xi is due to one or several jumps or just to the diffusion part 
of the Levy process. The only way to draw inference is to use the fact that 
the increments form independent realizations of infinitely divisible probabil- 
ity distributions. In this setting, a variety of methods have been proposed in 
the literature: standard maximum likelihood estimation DuMouchel (1973a, 
1973b, 1975), using the empirical characteristic function as an estimating 
equation [see, e.g., Press (1972), Fenech (1976), Feuerverger and McDun- 
nough (1981a), Singleton (2001)], maximum likelihood by Fourier inversion 
of the characteristic function Feuerverger and McDunnough (1981b), a re- 
gression based on the explicit form of the characteristic function Koutrou- 
velis (1980) or other numerical approximations Nolan (1997). Some of these 
methods were compared in Akgiray and Lamoureux (1989). Note that all of 
the aforementioned papers deal with the specific parametric (mainly stable) 
models. A semiparametric estimation problem for Levy models has recently 
been considered in Neumann and Reiss (2009) and Gugushvili (2008). 

The second problem, calibration, is of special importance for financial
applications because options are priced under an equivalent martingale
measure, and one can draw inference about this measure only from options data. Since
option data is sparse and the underlying inverse problem is usually ill-posed, 
we face a rather complicated estimation issue. Different approaches have 
been proposed in the literature to regularize the underlying inverse prob- 
lem. For example, in Cont and Tankov (2004) and Cont and Tankov (2006), 
a method based on the penalized least squares estimation with the minimal
entropy penalization is proposed. Belomestny and Reiss (2006) developed a 
spectral calibration method which avoids solving a high-dimensional opti- 
mization problem and is based on the direct inversion of a Fourier pricing 
formula with a cut-off regularization in spectral domain. This method essentially
employs the integrability property of the underlying Levy measure
(finite activity Levy processes), which excludes many interesting infinite
activity Levy processes.

In this paper we consider the problem of estimating the fractional or- 
der of a Levy process from low-frequency historical as well as options data. 
Our problem is a semiparametric one because we do not assume any spe- 
cific parametric model for the underlying process but only some asymptotic 
behavior. The spectral approach allows us to treat both estimation and cali- 
bration problems in a unified framework and leads to an efficient data-driven 
algorithm. Moreover, the fractional order estimate delivered by the spectral 
method possesses several interesting optimality properties. 

The problem of estimating the degree of activity of jumps in a semimartingale
framework using high-frequency financial data has recently been considered
in Aït-Sahalia and Jacod (2009). On the one hand, small increments
of the process turn out to be most informative for estimating the activity
index. On the other hand, these small increments are the ones where the
contribution from the continuous martingale part is mixed with the
contribution from the small jumps. Aït-Sahalia and Jacod (2009) proposed an
estimation procedure which is able to "see through" the continuous part and
consistently estimate the degree of activity for the small jumps under some
restrictions on the structure of the underlying semimartingale. Note that in
the case of Levy processes the degree of activity of jumps is identical to the
fractional order of the underlying Levy process. We also stress that the case
when both diffusion and jump components are present can be treated in
the framework of spectral estimation as well (see Section 6.9).

Short outline of the paper. In Section 2 we introduce the class of RLE
processes. Section 3 discusses some aspects of financial modeling with RLE
processes. Section 4 describes the observational model. In Section 5 methods
of estimating the characteristic function of a Levy process from low-frequency
historical and options data are presented. Section 6 is devoted
to the spectral calibration method of estimating the fractional order $\alpha$. We
discuss here the problems of regularization and derive minimax rates of
convergence for a class of Levy processes. In Section 7 an adaptive procedure
for estimating $\alpha$ is presented, and its properties are discussed. We conclude
with some simulation results.

2. Regular Levy processes of exponential type. In this section we recall 
some basic properties of Levy processes. 
2.1. Spectral properties of Levy processes. Consider a Levy process $X_t$
with a Levy measure $\nu$. That is, $X_t$ is a cadlag process with independent and
stationary increments such that the characteristic function of its marginals
$\phi_t(u)$ is given by

(2.1)  $\phi_t(u) := \mathrm{E}[e^{iuX_t}] = \exp\Bigl\{ t\Bigl( iu\mu - \frac{u^2\sigma^2}{2} + \int_{\mathbb{R}} \bigl(e^{iux} - 1 - iux 1_{\{|x|\le 1\}}\bigr)\,\nu(dx) \Bigr)\Bigr\}.$


So, any Levy process $X_t$ is characterized by the so-called Levy triple $(\mu, \sigma, \nu)$,
where $\mu \in \mathbb{R}$ is a drift, $\sigma \ge 0$ is a diffusion volatility and $\nu$ is a Levy measure.
Note that the drift $\mu$ depends on the type of truncation in (2.1). In
fact, this characterization is unique for a fixed truncation function and we
can reconstruct the Levy triple from the characteristic function $\phi_t(u)$. This
reconstruction may be viewed as consisting of three steps. First, because of



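(2.2)  $\frac{1}{u^2}\int_{\mathbb{R}} \bigl(e^{iux} - 1 - iux 1_{\{|x|\le 1\}}\bigr)\,\nu(dx) \to 0, \qquad u \to \infty,$

we can find $\sigma^2/2$ as $-\lim_{|u|\to\infty} |u|^{-2}\psi(u)$ with

$\psi(u) = t^{-1}\log(\phi_t(u)).$

Second, note that

$\int_{-1}^{1} \bigl(\tilde\psi(u) - \tilde\psi(u + w)\bigr)\,dw = \int_{\mathbb{R}} e^{iux}\,\rho(dx)$

with

$\tilde\psi(u) = \psi(u) + \frac{\sigma^2}{2}u^2, \qquad \rho(dx) = 2\Bigl(1 - \frac{\sin x}{x}\Bigr)\nu(dx).$

Since $\rho$ is a finite measure ($\int (x^2 \wedge 1)\,\nu(dx) < \infty$), one can uniquely reconstruct
it (and hence $\nu$) from $\tilde\psi(u)$. Finally, we find $\mu$ as $\lim_{u\to\infty}[\tilde\psi(u)/(iu)]$.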
So, in principle, we can recover all characteristics of the underlying Levy process
(including the fractional order) provided that $\phi_t$ is completely known. If,
however, $\phi_t$ is estimated from data we face an ill-posed estimation problem
because a small perturbation in $\phi_t$ may deteriorate its asymptotic behavior
and lead to the violation of (2.2). In this case, using a regularization technique
[see, e.g., Cont and Tankov (2004) or Belomestny and Reiss (2006)],
we can still get asymptotically consistent estimates for the whole triple
$(\mu, \sigma, \nu)$ given a consistent estimate of $\phi_t$.



Remark 2.1. A consistent estimation of $\psi(u)$ from a time series of $X_t$
is only possible if the number of observations from the distribution with the
c.f. $\phi_t(u)$ for some $t > 0$ increases. This can be either due to a decreasing time
step in a time series of the process $X$ (high frequency data) or due to an
increasing time horizon (low frequency data). While the first type of
observational models has received much attention in recent years, there are only
a few papers dealing with low frequency data [see, e.g., Neumann and Reiss
(2009)].

2.2. Fractional order of Levy processes. Let $X_t$ be a Levy process with
a Levy measure $\nu$. The value

$\alpha := \inf\Bigl\{ r > 0 : \int_{|x|\le 1} |x|^{r}\,\nu(dx) < \infty \Bigr\}$

is called the fractional order or the Blumenthal-Getoor index of the Levy
process $X_t$. This index $\alpha$ is related to the "degree of activity" of jumps. All
Levy measures put finite mass on the set $(-\infty, -\epsilon] \cup [\epsilon, \infty)$ for any arbitrary
$\epsilon > 0$, so if the process has infinite jump activity, it must be because of the
small "jumps," defined as those smaller than $\epsilon$. If $\nu([-\epsilon, \epsilon]) < \infty$ the process
has finite activity and $\alpha = 0$. But if $\nu([-\epsilon, \epsilon]) = \infty$, that is, the process
has infinite activity, and in addition the Levy measure $\nu((-\infty, -\epsilon] \cup [\epsilon, \infty))$
diverges near $0$ at a rate $|\epsilon|^{-\alpha}$ for some $\alpha > 0$, then the fractional order of $X_t$
is equal to $\alpha$. The higher $\alpha$ gets, the more frequent the small jumps become
[see Aït-Sahalia and Jacod (2009) for more discussion].
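
As a worked illustration of the definition (the specific density below is an assumption chosen for the example, not a model singled out at this point in the text), consider a symmetric tempered-stable-type Levy density with constants $c > 0$, $\lambda \ge 0$ and $0 < \alpha < 2$:

```latex
\nu(dx) = \frac{c\,e^{-\lambda |x|}}{|x|^{1+\alpha}}\,dx
\quad\Longrightarrow\quad
\int_{|x|\le 1} |x|^{r}\,\nu(dx)
  = 2c \int_{0}^{1} x^{\,r-1-\alpha} e^{-\lambda x}\,dx .
% Near x = 0 the factor e^{-\lambda x} is bounded between e^{-\lambda} and 1,
% so the integral is finite if and only if r - 1 - \alpha > -1, i.e. r > \alpha.
% Hence \inf\{ r > 0 : \int_{|x|\le 1} |x|^{r}\,\nu(dx) < \infty \} = \alpha .
```

The tempering factor only changes the tails of the measure, so it has no effect on the index.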

The Blumenthal-Getoor index is closely related to the notion of the degree
of jump activity, which applies to general semimartingales, as shown in
Aït-Sahalia and Jacod (2009), and reduces to the Blumenthal-Getoor index
in the special case of Levy processes.

Note also that the Blumenthal-Getoor index coincides with the stability
index for stable processes. Another example of processes having a prescribed
fractional order $\alpha$ is the class of tempered stable processes of order $\alpha$.
Boyarchenko and Levendorskii (2002) studied a generalization of tempered
stable processes, called regular Levy processes of exponential type (RLE). A
Levy process is said to be an RLE process of type $[\lambda_-, \lambda_+]$ and order $\alpha \in (0, 2)$
if the Levy measure has exponentially decaying tails with rates $\lambda_- \ge 0$ and
$\lambda_+ \ge 0$,

(2.3)  $\int_{-\infty}^{-1} e^{\lambda_- |y|}\,\nu(dy) < \infty, \qquad \int_{1}^{\infty} e^{\lambda_+ y}\,\nu(dy) < \infty,$

and behaves near zero as $|y|^{-(1+\alpha)}$:

$\int_{|y| > \epsilon} \nu(dy) = \frac{\pi(\epsilon)}{\epsilon^{\alpha}},$

where $\pi$ is some positive function on $\mathbb{R}_+$ satisfying $0 < \pi(+0) < \infty$. Obviously,
the fractional order of an RLE process of order $\alpha$ is equal to $\alpha$. An
equivalent definition of an RLE process in terms of its characteristic exponent
$\psi(u)$ can be given as follows. A Levy process is considered to be an RLE
process of type $[\lambda_-, \lambda_+]$ and order $\alpha \in (0, 2)$ if the following representation
holds:

(2.4)  $\psi(u) = i\mu u + \vartheta(u), \qquad \mu \in \mathbb{R},$

where the function $\vartheta$ admits a continuation from $\mathbb{R}$ into the strip $\{z \in \mathbb{C} : \operatorname{Im} z \in
[-\lambda_+, \lambda_-]\}$ and is of the form

(2.5)  $\vartheta(u) = -|u|^{\alpha}\pi(u),$

where $\pi(u)$ is a function satisfying

$\limsup_{|u|\to\infty} |\pi(u)| < \infty \quad\text{and}\quad \liminf_{|u|\to\infty} |\pi(u)| > 0$

such that

(2.6)  $\operatorname{Re}[\pi(u)] > 0, \qquad u \in \mathbb{R} \setminus \{0\}.$

As was mentioned in the Introduction, the class of RLE processes includes,
among others, hyperbolic, normal inverse Gaussian and tempered stable
processes but does not include the variance gamma process. In the sequel we will
mainly consider RLE processes without regularity conditions (2.3) (or
equivalently with $\lambda_- = \lambda_+ = 0$) since only the behavior of a Levy measure near
zero matters for the fractional order of the corresponding Levy process.

As mentioned before, in this work we are going to consider the problem 
of estimating the fractional order $\alpha$ of a Levy process $X_t$ from a time series
of asset prices as well as from option prices. Before turning to this, let us 
first make our modelling and observational framework more precise. 

3. Financial modelling. In this section we recall basic facts concerning 
financial modelling with exponential Levy models. 

3.1. Asset dynamics. We assume that the asset price $S_t$ follows an exponential
Levy model under both the historical measure P and the risk neutral measure
Q. Specifically, we suppose that

$S_t = \begin{cases} S e^{X_t}, & \text{under } \mathbb{P}, \\ S e^{rt + Y_t}, & \text{under } \mathbb{Q}, \end{cases}$

where $X_t$ and $Y_t$ are Levy processes, $S > 0$ is the present value of the asset
(at time 0) and $r \ge 0$ is the riskless interest rate which is assumed to be
known and constant. Note that the martingale condition for $e^{-rt}S_t$ under Q
entails $\mathrm{E}^{\mathbb{Q}}[e^{Y_t}] = 1$. The martingale measure Q is in fact not unique in
the presence of jumps. As is standard in the calibration literature, it is
assumed to be settled by the market and to be identical for all options
under consideration. Processes $X_t$ and $Y_t$ are related by the requirement
that measures P and Q ought to be equivalent: $\mathbb{P} \sim \mathbb{Q}$. Interestingly, this
implies that if $X_t$ and $Y_t$ are RLE processes and $X_t$ is of order $\alpha_{\mathbb{P}}$, then $Y_t$
has the order $\alpha_{\mathbb{Q}} = \alpha_{\mathbb{P}}$. Indeed, the equivalence of the corresponding Levy
measures $\nu^{\mathbb{P}}$ and $\nu^{\mathbb{Q}}$ implies [see Sato (1999)]

(3.1)  $\int \Bigl(\sqrt{d\nu^{\mathbb{Q}}/d\nu^{\mathbb{P}}} - 1\Bigr)^{2}\,\nu^{\mathbb{P}}(dx) < \infty.$

Since for RLE processes $d\nu^{\mathbb{Q}}(x)/d\nu^{\mathbb{P}}(x) \asymp x^{\alpha_{\mathbb{P}} - \alpha_{\mathbb{Q}}}$ and $d\nu^{\mathbb{P}}(x) \asymp x^{-(1+\alpha_{\mathbb{P}})}\,dx$
as $x \to +0$, the condition (3.1) can be satisfied only if $\alpha_{\mathbb{Q}} = \alpha_{\mathbb{P}}$. This means
that the fractional order of the underlying Levy process must be the same
under both historical and risk-neutral measures. This not only indicates the
importance of the fractional order parameter for financial applications but
also suggests that the combination of two estimates of the fractional order $\alpha$
under P and Q might be useful, for example, to reduce the overall variance
of the resulting combined estimator.

3.2. Option pricing. The risk neutral price at time $t = 0$ of the European
call option with strike $K$ and maturity $T$ is given by

$\mathcal{C}(K, T) = e^{-rT}\,\mathrm{E}^{\mathbb{Q}}[(S_T - K)^{+}].$

Using the independence of increments, we can reduce the number of parameters
by introducing the so-called negative log-forward moneyness,

$y := \log(K/S) - rT,$

such that the call price in terms of $y$ is given by

$\mathcal{C}(y, T) = S\,\mathrm{E}^{\mathbb{Q}}[(e^{Y_T} - e^{y})^{+}].$

The analogous formula for the price of the European put option is $\mathcal{P}(y, T) =
S\,\mathrm{E}^{\mathbb{Q}}[(e^{y} - e^{Y_T})^{+}]$, and the well-known put-call parity is easily established:

$\mathcal{C}(y, T) - \mathcal{P}(y, T) = S\,\mathrm{E}^{\mathbb{Q}}[e^{Y_T} - e^{y}] = S(1 - e^{y}).$

As we need to employ Fourier techniques, we introduce the function

(3.2)  $\mathcal{O}_T(y) := \begin{cases} S^{-1}\mathcal{C}(y, T), & y \ge 0, \\ S^{-1}\mathcal{P}(y, T), & y < 0. \end{cases}$

The function $\mathcal{O}_T$ records normalized call prices for $y \ge 0$ and normalized
put prices for $y < 0$. It possesses many interesting properties [see
Belomestny and Reiss (2006) for details]; one of them is the following
connection between the Fourier transform of $\mathcal{O}_T$ and the characteristic
function of $Y_T$, denoted by $\phi^{\mathbb{Q}}_T$:

(3.3)  $\mathbf{F}[\mathcal{O}_T](v) = \frac{1 - \phi^{\mathbb{Q}}_T(v - i)}{v(v - i)}, \qquad v \in \mathbb{R}.$
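
A formal sketch of why (3.3) holds may be helpful. It assumes, purely for illustration, that $Y_T$ has a density $f$ under Q, that $S = 1$, and that the Fourier transform is taken as $\mathbf{F}[g](v) = \int e^{ivy} g(y)\,dy$; the rigorous argument is in Belomestny and Reiss (2006).

```latex
% Put-call parity gives one expression valid for all y:
\mathcal{O}_T(y) = \mathrm{E}^{\mathbb{Q}}\bigl[(e^{Y_T} - e^{y})^{+}\bigr] - (1 - e^{y})^{+}.
% Differentiating twice in y (in the sense of distributions):
\mathcal{O}_T''(y) - \mathcal{O}_T'(y) = e^{y} f(y) - \delta_0(y).
% Taking Fourier transforms, with F[g'](v) = -iv F[g](v):
(-v^{2} + iv)\,\mathbf{F}[\mathcal{O}_T](v)
   = \int e^{ivy} e^{y} f(y)\,dy - 1
   = \phi^{\mathbb{Q}}_T(v - i) - 1,
% and dividing by -v(v - i) yields
\mathbf{F}[\mathcal{O}_T](v) = \frac{1 - \phi^{\mathbb{Q}}_T(v - i)}{v(v - i)} .
```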
Another property which directly follows from (3.3) is that
(3.4) / e-^0 T {y)dy < oo, 

provided that E[e 2 ^ T ] exists and is finite. 

4. Observations. We consider two kinds of observational models corre- 
sponding to two types of statistical problems we are going to tackle. While 
the first type of models assumes the time series of St is directly available, 
the second one supposes that only some functionals of St can be observed. 

4.1. Time series data. We assume that the values of the log-price process
$X_t = \log(S_t)$ on an equidistant time grid $\pi = \{t_0, t_1, \ldots, t_n\}$ are observed.

4.2. Option data. As to option data, we assume that we are given the
prices of $n$ call options for a set of forward log-moneynesses $y_0 < y_1 < \cdots < y_n$
and a fixed maturity $T$, corrupted by noise. In terms of the function $\mathcal{O}_T$, the
following sample is available:

(4.1)  $O_T(y_j) = \mathcal{O}_T(y_j) + \sigma(y_j)\xi_j, \qquad j = 1, \ldots, n.$

It is supposed that $\{\xi_j\}$ are independent, centered random variables with
$\mathrm{E}[\xi_j^2] = 1$ and $\sup_j \mathrm{E}[\xi_j^4] < \infty$. Furthermore, we assume that

$\int_{\mathbb{R}} e^{-2y}\sigma^2(y)\,dy < \infty.$

This condition is required because we need to transform the original regression
model (4.1) to an exponentially weighted one,

(4.2)  $\tilde O_T(y_j) = \tilde{\mathcal{O}}_T(y_j) + \tilde\sigma(y_j)\xi_j, \qquad j = 1, \ldots, n,$

with $\tilde O_T(y) = e^{-y} O_T(y)$, $\tilde{\mathcal{O}}_T(y) = e^{-y}\mathcal{O}_T(y)$ and $\tilde\sigma(y) = e^{-y}\sigma(y)$.

As a matter of fact, a consistent estimation of the fractional order $\alpha$ is
only possible if the amount of data available increases. In our asymptotic 
analysis we will therefore assume that the number of time series observations 
and the number of available options tend to infinity. 

5. Estimation of characteristic functions $\phi^{\mathbb{P}}$ and $\phi^{\mathbb{Q}}$. The main idea of
the spectral estimation method (SEM) is to draw inference on the parameters of the
underlying model using its special structure in the spectral domain. Since the
spectral behavior of an RLE process is described explicitly by (2.4) and (2.5),
we can apply SEM as soon as an estimate for the corresponding characteristic
function is available. While estimation of $\phi$ under P is rather straightforward,
its calibration from option prices under Q requires special treatment.






5.1. Estimation of $\phi$ under P. We estimate the characteristic function
by its empirical counterpart,

$\tilde\phi^{\mathbb{P}}(u) := \frac{1}{n}\sum_{i=1}^{n} e^{iu(X_{t_i} - X_{t_{i-1}})}.$

The empirical characteristic function $\tilde\phi^{\mathbb{P}}$ possesses many interesting properties,
and we refer to Ushakov (1999) for a comprehensive overview.
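
The empirical characteristic function is simple to compute. The following is a minimal sketch, assuming NumPy and a toy Brownian log-price path used only to have something to feed in; the function name `empirical_cf` and all numerical values are illustrative and not from the paper.

```python
import numpy as np

def empirical_cf(increments, u):
    """Empirical characteristic function of the increments at frequencies u."""
    x = np.asarray(increments, dtype=float)
    u = np.atleast_1d(np.asarray(u, dtype=float))
    # Average of exp(i*u*X_i) over the sample, one value per frequency.
    return np.exp(1j * np.outer(u, x)).mean(axis=1)

rng = np.random.default_rng(0)
path = np.cumsum(rng.standard_normal(20_000))   # toy log-price path (assumption)
incr = np.diff(np.concatenate(([0.0], path)))   # X_{t_i} - X_{t_{i-1}}
phi_hat = empirical_cf(incr, [0.0, 0.5, 1.0])
```

For standard normal increments the target value at frequency $u$ is $e^{-u^2/2}$, which gives a quick sanity check on the sample average.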



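5.2. Estimation of $\phi$ under Q. For estimating $\phi^{\mathbb{Q}}$, we employ the Fourier
technique. So, motivated by (3.3), we define

(5.1)  $\tilde\phi^{\mathbb{Q}}(u) := 1 - u(u + i)\sum_{j=1}^{n} \delta_j\,\tilde O_T(y_j)\,e^{iuy_j}, \qquad u \in \mathbb{R},$

where $\delta_j = y_j - y_{j-1}$ and $\tilde O_T$ is defined in (4.2). For more involved methods
of approximating $\mathbf{F}[\mathcal{O}_T](u)$ see Belomestny and Reiss (2006).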
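
To illustrate the Riemann-sum idea behind (5.1), here is a hedged sketch that recovers the characteristic function from noise-free option prices. It assumes a Gaussian $Y_T$ with variance $s^2$ and mean $-s^2/2$ (a Black-Scholes-type toy model chosen only because the prices are available in closed form), $S = 1$, and the convention that (3.3) is evaluated at $v = u + i$. All function names are hypothetical.

```python
import math
import numpy as np

_erf = np.vectorize(math.erf)

def norm_cdf(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + _erf(np.asarray(x, dtype=float) / math.sqrt(2.0)))

def toy_option_function(y, s):
    """Normalized calls (y >= 0) and puts (y < 0) when Y_T ~ N(-s^2/2, s^2)."""
    m = -0.5 * s * s
    call = norm_cdf((m + s * s - y) / s) - np.exp(y) * norm_cdf((m - y) / s)
    put = np.exp(y) * norm_cdf((y - m) / s) - norm_cdf((y - m - s * s) / s)
    return np.where(y >= 0, call, put)

def cf_from_options(u, y, o_obs):
    """Riemann-sum estimate of the c.f. from prices o_obs on the grid y."""
    delta = np.diff(y)                      # delta_j = y_j - y_{j-1}
    yj, oj = y[1:], o_obs[1:]
    o_tilde = np.exp(-yj) * oj              # exponential weighting
    return 1.0 - u * (u + 1j) * np.sum(delta * o_tilde * np.exp(1j * u * yj))

s = 0.2
y = np.linspace(-2.0, 2.0, 4001)
phi_est = cf_from_options(1.0, y, toy_option_function(y, s))
phi_true = np.exp(-0.5j * s * s - 0.5 * s * s)   # exact c.f. of Y_T at u = 1
```

With real data the grid is sparse and the prices are noisy, which is why the text points to more involved approximation schemes.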



6. Estimation of fractional order. In this section we turn to the problem 
of estimating the fractional order of a RLE process. To this aim we apply the 
spectral estimation method accompanied by a spectral cut-off regularization. 

6.1. Main idea. Let us consider an RLE process with the characteristic
exponent $\psi(u)$ of the form (2.4) and (2.5). In the sequel we assume (mainly
for the sake of simplicity) that $\lim_{u\to-\infty}\pi(u) = \lim_{u\to+\infty}\pi(u) = \eta \in \mathbb{R}_+$. In
this case we can rewrite $\vartheta$ as

(6.1)  $\vartheta(u) = -\eta|u|^{\alpha}\tau(u),$

where $\operatorname{Re}[\tau(u)] > 0$ for $u \in \mathbb{R} \setminus \{0\}$ and $\tau(u) \to 1$ as $|u| \to \infty$. The formula

(6.2)  $\mathcal{Y}(u) := \log(-\log(|\phi(u)|^2)) = \log(2\eta) + \alpha\log(u) + \log(\operatorname{Re}\tau(u)), \qquad u > 0,$

with $\phi(u) = \exp(\psi(u))$, suggests how to estimate $\alpha$ from $\phi$. Indeed, in terms
of the new "data" $\mathcal{Y}$, we have a linear semiparametric problem with the
"nuisance" nonparametric part $\log(\operatorname{Re}\tau(u))$. Since $\log(\operatorname{Re}\tau(u))$ tends to $0$
as $|u| \to \infty$, we can get rid of this component by basing our estimation on
$\mathcal{Y}(u)$ with large $|u|$. On the other hand, if we plug in an estimate $\tilde\phi$ instead
of $\phi$, the variance of $\mathcal{Y}(u)$ will increase exponentially with $|u|$ [because of
the exponential decay of $\phi(u)$], and we have to regularize the problem by
cutting high frequencies. An appropriate weighting scheme would allow us to
take both effects into account.
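
The linear structure in (6.2) can be checked numerically in the simplest case. The sketch below assumes a symmetric stable characteristic function (so the nuisance term vanishes) with illustrative parameter values, and uses an ordinary least squares fit rather than the weighted scheme developed later.

```python
import numpy as np

alpha, eta = 1.5, 0.7                       # illustrative values, not from the paper
u = np.linspace(1.0, 5.0, 50)
phi = np.exp(-eta * np.abs(u) ** alpha)     # symmetric stable c.f., tau(u) = 1
y = np.log(-np.log(np.abs(phi) ** 2))       # the transformed "data" of (6.2)

# Regress y on log(u): the slope is alpha, the intercept is log(2*eta).
slope, intercept = np.polyfit(np.log(u), y, 1)
```

With an estimated characteristic function the same regression becomes unstable at large frequencies, which is the reason for the truncation and cut-off steps.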
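6.2. Truncation. First, we truncate $\tilde\phi$ to avoid the logarithm's explosion.
Let

$\tilde{\mathcal{Y}}(u) := \log\bigl(-\log\bigl(T_{\omega_-,\omega_+}[|\tilde\phi|^2](u)\bigr)\bigr), \qquad u \in \mathbb{R} \setminus \{0\},$

where the truncation operator $T_{\omega_-,\omega_+}$ with truncation levels $0 < \omega_- < \omega_+ <
1$ is defined via

$T_{\omega_-,\omega_+}[f](u) := \begin{cases} \omega_+, & f(u) > \omega_+, \\ f(u), & \omega_- \le f(u) \le \omega_+, \\ \omega_-, & f(u) < \omega_-, \end{cases}$

for any real-valued function $f$.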
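
In code the truncation is a clipping operation applied before the double logarithm. A minimal sketch, assuming NumPy and constant truncation levels chosen arbitrarily for the example:

```python
import numpy as np

def truncated_log_log(abs_phi_sq, omega_minus, omega_plus):
    """log(-log(T[|phi|^2])) with |phi|^2 clipped to [omega_minus, omega_plus]."""
    clipped = np.clip(abs_phi_sq, omega_minus, omega_plus)
    return np.log(-np.log(clipped))

# Includes values (0, 1, and above 1) where log(-log(.)) would be undefined.
vals = np.array([0.0, 1e-12, 0.3, 1.0, 1.2])
y_tilde = truncated_log_log(vals, 1e-6, 0.99)
```

`np.clip` also accepts arrays for the bounds, so frequency-dependent levels can be passed in the same way.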

6.3. Linearization. Truncation allows us to linearize the problem. Set 

W ±M : = '*(«)! l^l + ^logl^llJ- 
The following lemma holds: 

Lemma 6.1. For any $u \in \mathbb{R} \setminus \{0\}$ and any $\omega_-(u), \omega_+(u)$ satisfying
< w_ < w* < < uj+ < 1, 
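the following inequality holds with probability one:

$\bigl|\tilde{\mathcal{Y}}(u) - \mathcal{Y}(u) - \zeta_1(u)\bigl(|\tilde\phi(u)|^2 - |\phi(u)|^2\bigr)\bigr| \le \zeta_2(u)\bigl(|\tilde\phi(u)|^2 - |\phi(u)|^2\bigr)^2,$

where

$\zeta_1(u) = 2^{-1}|\phi(u)|^{-2}\log^{-1}(|\phi(u)|)$

and

$\zeta_2(u) = 2\max_{\xi \in \{\omega_-(u),\,\omega_+(u)\}} \frac{1 + |\log(\xi)|}{\xi^2\log^2(\xi)}.$
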

Using the notation

$\Delta(u) := |\tilde\phi(u)|^2 - |\phi(u)|^2,$

Lemma 6.1 can be reformulated as follows:

Corollary 6.2. For any $u \in \mathbb{R} \setminus \{0\}$,

(6.3)  $\tilde{\mathcal{Y}}(u) - \mathcal{Y}(u) = \zeta_1(u)\Delta(u) + Q(u),$

where

(6.4)  $|Q(u)| \le \zeta_2(u)\Delta^2(u)$

with probability one.
Remark 6.1. Since $\phi(0) = 1$ and $\phi(u) \to 0$ as $|u| \to \infty$, the behavior
of the truncation levels $\omega_-(u)$ and $\omega_+(u)$ in the vicinity of the points $u = 0$ and
$u = \infty$ becomes important for determining the behavior of $\tilde{\mathcal{Y}}(u)$ around
these points. However, the values of $\tilde{\mathcal{Y}}(u)$ around $0$ will be discarded while
estimating $\alpha$, and hence we do not need to know $\omega_+(u)$ for small $|u|$. As to
$\omega_-(u)$ and $\omega_+(u)$ for large $u$, they can be constructed if some prior
information on the Blumenthal-Getoor index $\alpha$ and the function $\pi(u) = \eta\tau(u)$
is available. For instance, if $0 < \underline{\alpha} < \alpha < \bar\alpha < 2$ and $0 < \pi_- < \operatorname{Re}[\pi(u)] < \pi_+$
for all $|u| > u_0$ with large enough $u_0 > 0$, then one can take

$\omega_-(u) = C_1 e^{-2\pi_+|u|^{\bar\alpha}}, \qquad |u| > u_0,$
$\omega_+(u) = C_2 e^{-2\pi_-|u|^{\underline{\alpha}}}, \qquad |u| > u_0,$

with some constants $C_1 > 0$ and $C_2$ depending on $\pi_+$ and $\pi_-$, respectively.
While a prior upper estimate $\bar\alpha$ for $\alpha$ appears also in the minimax rates
of convergence proved in Section 6.6, a lower estimate $\underline{\alpha}$ turns out to be
irrelevant for the convergence rates.

Note that the slope coefficient $\zeta_1$ grows exponentially with $|u|$. This means
that the variance of $\tilde{\mathcal{Y}}(u)$ grows exponentially as well and the values of $\tilde{\mathcal{Y}}(u)$
with large $|u|$ should be discarded when estimating $\alpha$.

6.4. Spectral cut-off estimation. Taking into account the special semilinear
structure of (6.2) together with the heteroscedastic variance of $\tilde{\mathcal{Y}}(u)$,
we apply a weighted least squares method to estimate $\alpha$. Let $w^1(u)$ be a
function supported on $[\epsilon, 1]$ with some $\epsilon > 0$ that satisfies

(6.5)  $\int_0^1 w^1(u)\log(u)\,du = 1, \qquad \int_0^1 w^1(u)\,du = 0.$

For any $U > 0$ put

$w^U(u) = U^{-1}w^1(uU^{-1})$

and define an estimate $\tilde\alpha_U$ of $\alpha$ as

(6.6)  $\tilde\alpha_U = \int_0^{\infty} w^U(u)\tilde{\mathcal{Y}}(u)\,du.$


It is instructive to see what happens with $\tilde\alpha_U$ in the case of exact data, that
is, $\tilde{\mathcal{Y}} = \mathcal{Y}$. One can see that in this case the following decomposition holds:

$\tilde\alpha_U = \log(2\eta)\int_0^{\infty} w^U(u)\,du + \alpha\int_0^{\infty} w^U(u)\log(u)\,du + R_U$

with

(6.7)  $R_U := \int_0^{\infty} w^U(u)\log(\operatorname{Re}\tau(u))\,du.$

By (6.5) and the substitution $u = Uv$, the first integral vanishes and the second
equals one, so that $\tilde\alpha_U = \alpha + R_U$.
So, even in the case of perfect observations we still have the "bias" term $R_U$
induced by model misspecification. Indeed, when applying the least squares
method we ignore the nonlinearity caused by $R_U$ and treat the problem as
being linear. This is, however, only justified if $R_U$ is small. In fact, $R_U$ can
be made small by taking large values of $U$.
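
The weighted estimator can be sketched as follows. The particular weight function (an affine function of $\log u$ on $[\epsilon, 1]$, with coefficients solved numerically so that both moment constraints hold) is an assumption made for this example; the text only requires the two constraints. Quadrature uses the trapezoid rule, and all names and parameter values are illustrative.

```python
import numpy as np

def trapezoid_weights(x):
    """Quadrature weights q such that sum(q * f(x)) is the trapezoid rule."""
    q = np.zeros_like(x)
    q[1:] += 0.5 * np.diff(x)
    q[:-1] += 0.5 * np.diff(x)
    return q

def make_w1(v, q):
    """w^1 = c0 + c1*log(v) on the grid v, with c0, c1 solving both constraints."""
    lv = np.log(v)
    # Row 1: integral of w1*log(v) equals 1. Row 2: integral of w1 equals 0.
    a = np.array([[np.sum(q * lv), np.sum(q * lv * lv)],
                  [np.sum(q),      np.sum(q * lv)]])
    c0, c1 = np.linalg.solve(a, np.array([1.0, 0.0]))
    return c0 + c1 * lv

def alpha_estimate(y_func, cutoff, v, q, w1):
    """Integral of w^U(u) * y(u) du, computed after substituting u = cutoff * v."""
    return np.sum(q * w1 * y_func(cutoff * v))

v = np.linspace(0.1, 1.0, 2001)              # support [eps, 1] with eps = 0.1
q = trapezoid_weights(v)
w1 = make_w1(v, q)

alpha, eta = 1.2, 0.5                         # illustrative values
exact_y = lambda u: np.log(2 * eta) + alpha * np.log(u)   # exact data, tau = 1
est = alpha_estimate(exact_y, 10.0, v, q, w1)
```

With exact data and no nuisance term the estimate reproduces the true order for any cut-off, since the bias term is then zero.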

6.5. Further specification of the model class. In order to rigorously study
the complexity of the underlying estimation problem, we have to make further
assumptions about the model class. Let us consider a class of Levy
models $\mathcal{A}(\bar\alpha, \eta_-, \eta_+, \varkappa)$ with

(6.8)  $\psi(u) = i\mu u + \vartheta(u), \qquad \vartheta(u) = -\eta|u|^{\alpha}\tau(u), \qquad u \in \mathbb{R},$

where $0 < \alpha \le \bar\alpha < 2$,

(6.9)  $0 < \eta_- \le \eta \le \eta_+ < \infty$

and

(6.10)  $|1 - \tau(u)| \lesssim \frac{1}{|u|^{\varkappa}}, \qquad |u| \to \infty,$

for some $0 < \varkappa \le \alpha$. We will write

$(\alpha, \eta, \tau) \in \mathcal{A}(\bar\alpha, \eta_-, \eta_+, \varkappa)$

to indicate that the Levy process with the characteristics $(\alpha, \eta, \tau)$ is in
the class $\mathcal{A}$. The following proposition shows that conditions (6.8), (6.9)
and (6.10) can in fact be rephrased in terms of the Levy density of an
$\mathcal{A}(\bar\alpha, \eta_-, \eta_+, \varkappa)$ process.

Proposition 6.3. Let $\nu(x)$ be the Levy density of a Levy process satisfying
(6.8) where the function $\tau$ fulfills

(6.11)  $\tau(u) = 1 + D_{\pm}|u|^{-\varkappa} + o(|u|^{-\varkappa}), \qquad u \to \pm\infty,$

with some constants $D_+$ and $D_-$. Then

(6.12)  $\int_{|x| < \epsilon} x^2\nu(x)\,dx = c\,\epsilon^{2-\alpha}\theta(\epsilon),$

where $c > 0$ is a constant depending on $\eta$ and $\alpha$ and the function $\theta(\epsilon)$ satisfies

$|\theta(\epsilon) - 1| \lesssim \epsilon^{\varkappa}, \qquad \epsilon \to 0.$

As will be shown in the next two sections, even in the class $\mathcal{A}(\bar\alpha, \eta_-, \eta_+, \varkappa)$
the problem of estimating $\alpha$ is severely ill-posed, that is, a small perturbation
$\varepsilon$ in data may lead (in the worst case) to a $\log^{-\varkappa/\bar\alpha}(1/\varepsilon)$ distance between $\alpha$ and
its best estimate. On the other hand, it turns out that our estimate $\tilde\alpha_U$ achieves
the best possible rates of convergence in the class $\mathcal{A}(\bar\alpha, \eta_-, \eta_+, \varkappa)$.
6.6. Upper bounds. Let us define



e :- 



3=1 



under P, 
under O. 



where 



T,7=iSla(y j ) = e-y^(y j ) and 5j 



In the case of calibration $\varepsilon$ comprises the level of the numerical interpolation error and
of the statistical error simultaneously. In this section we will study the
asymptotic behavior of the estimate $\tilde\alpha_U$ defined in (6.6) as $\varepsilon \to 0$,
$A := \min\{-y_0, y_n\} \to \infty$ and $e^{-A} \lesssim \|\delta\|^2$. Thus, it is assumed that the number
of historical observations as well as the number of available options tend
to infinity. First, we present an upper bound showing that our estimate $\tilde\alpha_U$
with the "optimal" choice of the cut-off parameter $U$ converges to $\alpha$ with a
logarithmic rate in $\varepsilon$.



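Theorem 6.4. For $U = \bar U$ with

$\bar U = \Bigl[\frac{1}{2\eta_+}\log\bigl(\varepsilon^{-1}\log^{-\beta}(1/\varepsilon)\bigr)\Bigr]^{1/\bar\alpha}$

and

$\beta = \begin{cases} 1 + \varkappa/\bar\alpha, & \text{under } \mathbb{P}, \\ (\varkappa + 4)/\bar\alpha - 1, & \text{under } \mathbb{Q}, \end{cases}$

it holds

(6.13)  $\sup_{(\alpha,\eta,\tau)\in\mathcal{A}(\bar\alpha,\eta_-,\eta_+,\varkappa)} \mathrm{E}|\tilde\alpha_{\bar U} - \alpha|^2 \lesssim \mathcal{R}(\varepsilon), \qquad \varepsilon \to 0,$

where

$\mathcal{R}(\varepsilon) = \Bigl(\frac{1}{2\eta_+}\log\varepsilon^{-1}\Bigr)^{-2\varkappa/\bar\alpha}.$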



Remark 6.2. Since the rates are logarithmic, it is usual to call the underlying
estimation problem severely ill-posed. From a practical point of view,
severe ill-posedness means that more observations are needed to reach the
desired level of accuracy than for well-posed problems.
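
To make the practical meaning of a logarithmic rate concrete, the following sketch evaluates a rate of the form $(\log(1/\varepsilon))^{-2\varkappa/\bar\alpha}$ while ignoring all constants. It assumes, for illustration only, that the noise level behaves like $n^{-1/2}$ for $n$ observations; the parameter values are hypothetical.

```python
import math

def log_rate(n, kappa, alpha_bar):
    """(log(1/eps))^(-2*kappa/alpha_bar) with eps = n^(-1/2); constants ignored."""
    eps = n ** -0.5
    return math.log(1.0 / eps) ** (-2.0 * kappa / alpha_bar)

# Going from a thousand to a million observations doubles log(1/eps).
ratio = log_rate(10**6, 1.0, 2.0) / log_rate(10**3, 1.0, 2.0)
```

Under these assumptions a thousandfold increase in sample size only halves the bound on the squared error, whereas a parametric rate would shrink it by a factor of a thousand.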

Remark 6.3. As can be easily seen, the convergence rates depend on $\bar\alpha$,
a prior upper bound for $\alpha$. If there is no prior information on $\alpha$ one may
take $\bar\alpha = 2$.
Remark 6.4. For symmetric stable processes we have $\tau(u) = 1$ and it
can be shown that the rates are parametric in this case, that is,

$\sup_{(\alpha,\eta,\tau)\in\mathcal{A}(\bar\alpha,\eta_-,\eta_+,\infty)} \mathrm{E}|\tilde\alpha_U - \alpha|^2 \lesssim \varepsilon, \qquad \varepsilon \to 0,$

for some $U$ depending on $\varepsilon$.

6.7. Lower bounds. Now we show that the rates obtained in the previous 
section are the best ones in the minimax sense for the class $\mathcal{A}(\bar\alpha, \eta_-, \eta_+, \varkappa)$.



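Theorem 6.5. It holds

(6.14)  $\lim_{s\to 0}\,\liminf_{\varepsilon\to 0}\,\inf_{\tilde\alpha}\,\sup_{(\alpha,\eta,\tau)\in\mathcal{A}(\bar\alpha,\eta_-,\eta_+,\varkappa)} \varphi^{-2}_{\eta_+,s}(\varepsilon)\,\mathrm{E}(|\tilde\alpha - \alpha|^2) = O(1),$

where

$\varphi_{\eta_+,s}(\varepsilon) = \Bigl(\frac{1}{2\eta_+}\log\varepsilon^{-1}\Bigr)^{-\varkappa/(\bar\alpha - s)}$

and the infimum is taken over all estimators $\tilde\alpha$ of $\alpha$.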

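6.8. Asymptotic behavior. In this section we complete the investigation
of asymptotic properties of the estimate $\tilde\alpha_U$ by proving its asymptotic
normality. In the case of estimation under P we have the following:

Theorem 6.6. Denote

$\varsigma(\varepsilon, U) := \varepsilon\Bigl[\int_0^{U}\!\!\int_0^{U} w^U(u)w^U(v)\zeta_1(u)\zeta_1(v)S(u, v)\,du\,dv\Bigr]^{1/2}$

with

$S(u, v) := \operatorname{Re}\phi(u - v) + \operatorname{Im}\phi(u + v) - (\operatorname{Re}\phi(u) + \operatorname{Im}\phi(u))(\operatorname{Re}\phi(v) + \operatorname{Im}\phi(v)).$

Let $U = U(\varepsilon)$ be a sequence of cutoffs such that $\varsigma^{-1}(\varepsilon, U(\varepsilon))R_{U(\varepsilon)} \to 0$ as
$\varepsilon \to 0$. Then

$\varsigma^{-1}(\varepsilon, U(\varepsilon))(\tilde\alpha_{U(\varepsilon)} - \alpha) \rightsquigarrow \mathcal{N}(0, 1), \qquad \varepsilon \to 0.$

Remark 6.5. The choice of $U(\varepsilon)$ is based on the following reasoning.
On the one hand, we have to require that $\varsigma^{-1}(\varepsilon, U)R_U \to 0$ in order to ensure
that $\varsigma^{-1}(\varepsilon, U)(\tilde\alpha_U - \alpha)$ has asymptotically zero expectation. On the other
hand, the variance of $\varsigma^{-1}(\varepsilon, U)\tilde\alpha_U$ should converge as $\varepsilon \to 0$, and the limit
must be bounded and nondegenerate.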
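Remark 6.6. Given an estimate $\tilde\phi$ of $\phi$ and some $U = U(\varepsilon)$ such that
$|\tilde\phi(u)| \ne 0$ on $[-U, U]$ and $|\tilde\phi(u)| \ne 1$ on $[-U, U] \setminus \{0\}$, we can estimate the
norming factor $\varsigma(\varepsilon, U)$ for $\tilde\alpha_U$ via

$\tilde\varsigma(\varepsilon, U) := \varepsilon\Bigl[\int_0^{U}\!\!\int_0^{U} w^U(u)w^U(v)\tilde\zeta_1(u)\tilde\zeta_1(v)\tilde S(u, v)\,du\,dv\Bigr]^{1/2}$

with

$\tilde S(u, v) := \operatorname{Re}\tilde\phi(u - v) + \operatorname{Im}\tilde\phi(u + v) - (\operatorname{Re}\tilde\phi(u) + \operatorname{Im}\tilde\phi(u))(\operatorname{Re}\tilde\phi(v) + \operatorname{Im}\tilde\phi(v))$

and

$\tilde\zeta_1(u) := |\tilde\phi(u)|^{-2}\log^{-1}(|\tilde\phi(u)|^2).$

A similar result can be proved in the case of calibration as well.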

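6.9. Processes with a nonzero diffusion part. In fact, the spectral calibration
algorithm allows us to treat more general models with a nonzero diffusion
part. Let $\mathcal{A}(\bar\sigma, \bar\alpha, \eta_-, \eta_+, \varkappa)$ be a class of Levy processes with the
characteristic exponent of the form

(6.15)  $\psi_\sigma(u) = i\mu u - \sigma^2u^2/2 + \vartheta(u), \qquad \vartheta(u) = -\eta|u|^{\alpha}\tau(u), \qquad u \in \mathbb{R},$

where $0 < \alpha < \bar\alpha$ and conditions (6.9) and (6.10) are fulfilled. We will write
$(\sigma, \alpha, \eta, \tau) \in \mathcal{A}(\bar\sigma, \bar\alpha, \eta_-, \eta_+, \varkappa)$ to indicate that a Levy process with the
characteristic exponent (6.15) belongs to $\mathcal{A}(\bar\sigma, \bar\alpha, \eta_-, \eta_+, \varkappa)$.

Assume first that $\phi_\sigma(u) = \exp(\psi_\sigma(u))$ is known exactly. Define

$\mathcal{L}(u) := \log(|\phi_\sigma(u)|^2) = -\sigma^2u^2 + 2\operatorname{Re}[\vartheta(u)]$

and

$\mathcal{L}_\xi(u) := \xi^2\mathcal{L}(u) - \mathcal{L}(\xi u) = \log\bigl(|\phi_\sigma(u)|^{2\xi^2}/|\phi_\sigma(\xi u)|^2\bigr) =: \log(\rho_\xi(u))$

for some $\xi > 1$. The diffusion terms cancel, and it obviously holds

$\mathcal{L}_\xi(u) = -2\eta|u|^{\alpha}\bigl(\xi^2\operatorname{Re}[\tau(u)] - \xi^{\alpha}\operatorname{Re}[\tau(\xi u)]\bigr) = -c_\xi(\alpha)|u|^{\alpha}\tau_\xi(u),$

where $c_\xi(\alpha) = 2\eta(\xi^2 - \xi^{\alpha})$, and $\tau_\xi(u)$ fulfills

(6.16)  $|1 - \tau_\xi(u)| \lesssim \frac{1}{|u|^{\varkappa}}, \qquad |u| \to \infty.$

Thus, $\mathcal{L}_\xi(u)$ has a structure similar to the structure of $\vartheta(u)$ in (6.8) and we
can carry over the results of the previous section to the more general model
(6.15) by defining

$\tilde{\mathcal{Y}}_\xi(u) := \log\bigl(-\log\bigl(T_{\omega_-,\omega_+}[\tilde\rho_\xi](u)\bigr)\bigr),$

where $\tilde\rho_\xi(u) = |\tilde\phi(u)|^{2\xi^2}/|\tilde\phi(\xi u)|^2$ with $\tilde\phi$ being an estimate of $\phi_\sigma$. Define

(6.17)  $\tilde\alpha_{\xi,U} := \int_0^{\infty} w^U(u)\tilde{\mathcal{Y}}_\xi(u)\,du.$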
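
The cancellation of the diffusion term can be verified numerically. The sketch below assumes a stable jump part with no nuisance term plus a Gaussian component, with illustrative parameter values, and uses an unweighted least squares fit in place of the weighted estimator.

```python
import numpy as np

sigma, eta, alpha, xi = 0.3, 0.5, 1.3, 2.0   # illustrative values only

def log_abs_phi_sq(u):
    """log|phi_sigma(u)|^2 for a stable jump part (tau = 1) plus diffusion."""
    return -sigma**2 * u**2 - 2.0 * eta * np.abs(u) ** alpha

u = np.linspace(1.0, 4.0, 40)
# xi^2 * L(u) - L(xi * u): the sigma^2 * u^2 contributions cancel exactly.
l_xi = xi**2 * log_abs_phi_sq(u) - log_abs_phi_sq(xi * u)
slope, _ = np.polyfit(np.log(u), np.log(-l_xi), 1)
```

The fitted slope equals the fractional order regardless of the value of the diffusion coefficient, which is the point of the rescaling trick.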



The following two theorems are extensions of Theorems 6.4 and 6.5, respec- 
tively, to the case of Levy models with a nonzero diffusion part. 



Theorem 6.7. For U = U with 



U 



and P = 1 + >c/2, it holds 



llog^-Mog-^l/e)) 
la 



n i/2 



(6.18) 
where 



sup E\a^jy — a\ <lZ(e) 

(o,a,r?,T)6»4(a,a,?7-,?7+,x) 



0. 



Theorem 6.8. It holds 



(6.19) liminfinf 



£^0 



sup 



<5- 2 (e)E(|5-a| 2 ) = 0(l), 



where 



— loge 1 
2a B 



1 -x/ 2 



and the infimum is taken over all estimators $\tilde\alpha$ of $\alpha$.

As can be seen, the estimate $\tilde\alpha_{\xi,U}$ is consistent as long as $\alpha < 2$. The
nearer $\alpha$ is to 2, the closer the constant $c_\xi(\alpha)$ is to zero and the more
difficult the estimation problem becomes.

7. Adaptive procedure. Minimax results obtained in the previous sections
show the complexity of the underlying estimation problem but are not
very helpful in practice. Putting aside the fact that they are related to the
performance of the procedure in the worst situation (worst case scenario),
which is not necessarily the case for the given model from $\mathcal{A}(\bar\alpha, \eta_-, \eta_+, \varkappa)$,
the choice of $U$ suggested there depends on $\bar\alpha$, is asymptotic and likely to
be inefficient for small sample sizes. In this section we propose an adaptive
procedure for choosing the cut-off parameter $U$. First, let us fix a sequence
of cut-off parameters $U_1 > U_2 > \cdots > U_K$ and define

$\tilde\alpha_k := \int_0^{\infty} w^{U_k}(u)\tilde{\mathcal{Y}}(u)\,du, \qquad k = 1, \ldots, K.$
We suggest a method based on the combination of multiple testing and
aggregation ideas [see Belomestny and Spokoiny (2007)]. Namely, for the
sequence of estimates $\hat\alpha_k$ consider a sequence of nested hypotheses
$H_k\colon \alpha_1 = \cdots = \alpha_k = \alpha$ where

$$\alpha_k = \int_0^\infty w^{U_k}(u)\mathcal{Y}(u)\,du, \qquad k=1,\ldots,K.$$

The hypothesis $H_k$ basically means that $R_{U_i} = 0$ for $i=1,\ldots,k$. The
procedure is sequential; we put $\tilde\alpha_1 = \hat\alpha_1$ and start with $k=2$, and at each step
$k$ the hypothesis $H_k$ is tested against $H_{k-1}$. For testing $H_k$ against $H_{k-1}$
we check that the previously constructed adaptive estimate $\tilde\alpha_{k-1}$ belongs
to the confidence interval built on $\hat\alpha_k$. Then we put

$$\tilde\alpha_k = \gamma_k\hat\alpha_k + (1-\gamma_k)\tilde\alpha_{k-1}. \tag{7.1}$$

The mixing parameter $\gamma_k$ is defined using a measure of statistical difference
between $\tilde\alpha_{k-1}$ and $\hat\alpha_k$,

$$\gamma_k := \mathcal{K}(T_k/V_k), \qquad T_k := (\hat\alpha_k-\tilde\alpha_{k-1})^2/\sigma_k^2,$$

where $\sigma_k^2$ is the variance of $\hat\alpha_k$, $\mathcal{K}$ is a kernel supported on $[0,1]$ and $\{V_k\}$ is a
set of critical values. In particular, $\gamma_k$ is equal to zero if $H_k$ is rejected; that
is, $\tilde\alpha_{k-1}$ lies outside the confidence interval around $\hat\alpha_k$. The final estimate is
equal to $\tilde\alpha_K$.
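The sequential rule (7.1) can be sketched in a few lines. This is a hedged illustration, not the author's implementation: the function names, the triangle kernel as default and the convention that the critical value at index 0 is unused are assumptions made for the example.

```python
import numpy as np

def triangle_kernel(x):
    """Kernel supported on [0, 1]: K(x) = 1 - x for 0 <= x <= 1, else 0."""
    x = np.asarray(x, dtype=float)
    return np.where((x >= 0.0) & (x <= 1.0), 1.0 - x, 0.0)

def aggregate(estimates, variances, critical_values, kernel=triangle_kernel):
    """Sequential aggregation following (7.1).

    estimates[k], variances[k]: nonadaptive estimate and its variance at step k;
    critical_values[k]: critical value used at step k (index 0 is unused).
    Returns the list of aggregated estimates; the last entry is the final one.
    """
    aggregated = [float(estimates[0])]
    for k in range(1, len(estimates)):
        # Test statistic: squared distance to the previous aggregated estimate
        t_k = (estimates[k] - aggregated[-1]) ** 2 / variances[k]
        # Mixing weight: zero when the statistic exceeds the critical value
        gamma_k = float(kernel(t_k / critical_values[k]))
        aggregated.append(gamma_k * estimates[k] + (1.0 - gamma_k) * aggregated[-1])
    return aggregated
```

With a kernel that vanishes outside $[0,1]$, a new estimate that differs too much from the current aggregate receives weight zero, so the aggregate stops moving once the hypothesis is rejected.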

7.1. Choice of the critical values $V_k$. The critical values $V_1,\ldots,V_{K-1}$
are selected by a reasoning similar to the standard approach of hypothesis
testing theory. We would like to provide prescribed performance of the
procedure under the simplest (null) hypothesis. In the considered set-up, the
null means that

$$\alpha_1 = \cdots = \alpha_K = \alpha. \tag{7.2}$$

In this case it is natural to expect that the estimate $\tilde\alpha_k$ coming out of the first
$k$ steps of the procedure is close to the nonadaptive counterpart $\hat\alpha_k$.

To give a precise definition we need to specify a loss function. Suppose
that the risk of estimation for an estimate $\hat\alpha$ of $\alpha$ is measured by $\mathrm{E}|\hat\alpha-\alpha|^{2r}$
for some $r>0$. It is not difficult to show that under the null hypothesis
(7.2), each estimate $\hat\alpha_k$ asymptotically fulfills

$$\varepsilon^{-1/2}(\hat\alpha_k-\alpha) \sim \mathcal{N}(0,\sigma_k^2), \qquad \varepsilon\to 0.$$

For example, in the case of estimation under P one can prove (see the proof
of Proposition 6.6) that

$$\sigma_k^2 = \int_0^\infty\!\!\int_0^\infty w^{U_k}(u)\zeta_1(u)w^{U_k}(v)\zeta_1(v)S(u,v)\,du\,dv \tag{7.3}$$

with

$$S(u,v) := \mathrm{Re}\,\phi(u-v) + \mathrm{Im}\,\phi(u+v) - (\mathrm{Re}\,\phi(u)+\mathrm{Im}\,\phi(u))(\mathrm{Re}\,\phi(v)+\mathrm{Im}\,\phi(v)).$$

Therefore,

$$\mathrm{E}_0|\sigma_{k,\varepsilon}^{-2}(\hat\alpha_k-\alpha)^2|^r \approx C_r,$$

where $\sigma_{k,\varepsilon}^2 = \varepsilon\sigma_k^2$, $C_r = \mathrm{E}|\xi|^{2r}$, and $\xi$ is standard normal. We require the
parameters $V_1,\ldots,V_{K-1}$ of the procedure to satisfy

$$\mathrm{E}_0|\sigma_{k,\varepsilon}^{-2}(\tilde\alpha_k-\hat\alpha_k)^2|^r \le \gamma C_r, \qquad k=2,\ldots,K. \tag{7.4}$$

Here $\gamma$ stands for a preselected constant having the meaning of a confidence
level of the procedure. This gives us $K-1$ conditions to fix $K-1$ parameters.
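The right-hand side of (7.4) is explicit, since the absolute moments of a standard normal variable have the closed form $\mathrm{E}|\xi|^{2r} = 2^r\Gamma(r+1/2)/\sqrt{\pi}$. The sketch below computes $C_r$ and checks an empirical version of the condition; the helper names and the idea of passing in pre-simulated values are assumptions for illustration, not part of the paper.

```python
import math

def normal_abs_moment(r):
    """C_r = E|xi|^(2r) for standard normal xi: 2^r * Gamma(r + 1/2) / sqrt(pi)."""
    return 2.0**r * math.gamma(r + 0.5) / math.sqrt(math.pi)

def satisfies_condition(scaled_sq_diffs, r, gamma):
    """Empirical check of a condition of the form (7.4).

    scaled_sq_diffs: simulated values of
    sigma^{-2} * (aggregated - nonadaptive)^2 under the null (hypothetical input).
    """
    empirical_risk = sum(abs(d)**r for d in scaled_sq_diffs) / len(scaled_sq_diffs)
    return empirical_risk <= gamma * normal_abs_moment(r)
```

In a Monte Carlo calibration one would presumably increase each critical value until this check passes for the corresponding step; how the null samples are generated is left open here.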

Our definition still involves two parameters $\gamma$ and $r$. It is important to
mention that their choice is subjective and there is no way for an automatic
selection. A proper choice of the power $r$ for the loss function as well as
the "confidence level" $\gamma$ depends on the particular application and on the
additional subjective requirements of the procedure.



8. Simulations. 



8.1. Estimation of the fractional order from a time series. Let us
consider the generalized hyperbolic (GH) Levy model which was introduced in
a series of papers [Eberlein and Keller (1995), Eberlein, Keller and Prause
(1998) and Eberlein and Prause (2002)] and emerged from extensive
empirical investigations of financial time series. See also Eberlein (2001) for
a survey on a number of analytical aspects of this model. The characteristic
function $\Phi_{GH}$ of increments in the GH Levy model with parameters
$(\kappa,\beta,\delta,\lambda)$ is given by

$$\Phi_{GH}(u) = e^{i\mu u}\,\frac{(\sqrt{\kappa^2-\beta^2})^{\lambda}\,K_\lambda(\delta\sqrt{\kappa^2-(\beta+iu)^2})}{(\sqrt{\kappa^2-(\beta+iu)^2})^{\lambda}\,K_\lambda(\delta\sqrt{\kappa^2-\beta^2})},$$

where $K_\lambda$ is the modified Bessel function of the second kind. $\Phi_{GH}$ has the
Levy-Khintchine representation of the form

$$\Phi_{GH}(u) = \exp\left(ibu + \int_{-\infty}^{\infty}(e^{iux}-1-iux)g(x)\,dx\right).$$

Note that this model does not contain a Gaussian component $\sigma^2u^2/2$. The
function $g(x)$, the density of the corresponding Levy measure, can be represented
[see Eberlein (2001)] in an integral form. From this representation the
following expansion for $\rho(x) = x^2g(x)$ can be obtained:

$$\rho(x) = \frac{\delta}{\pi} + \frac{\lambda+1/2}{2}|x| + \frac{\delta\beta}{\pi}x + o(|x|), \qquad x\to 0.$$

A direct consequence of this expansion is that

$$\int_{|x|>\varepsilon} g(x)\,dx \asymp 1/\varepsilon, \qquad \varepsilon\to 0,$$

and hence the fractional order of the GH Levy model is equal to 1. In our
simulation study we simulate a GH Levy process $X$ with $\beta=0$, $\lambda=1$ and
different pairs of $\kappa$ and $\delta$ at $n+1$ equidistant points $\{0,\Delta,\ldots,n\Delta\}$. Upon
that we construct the empirical characteristic function of increments:

$$\hat\phi(u) = \frac{1}{n}\sum_{k=1}^{n}\exp(iu(X_{k\Delta}-X_{(k-1)\Delta})).$$
Following the description of the spectral estimation algorithm, define

$$\hat{\mathcal{Y}}(u) := \log(-\log(T_{\omega_-,\omega_+}[|\hat\phi|^2](u))),$$

where the truncation levels $\omega_-$ and $\omega_+$ are constant in $u$ and are equal to
0.01 and 0.95, respectively. In fact, for practical applications with medium
sample sizes $n$, the choice of these levels is not crucial. Now consider the
following minimization problem:

$$(l_0^U, l_1^U) = \arg\min_{l_0,l_1}\int_0^\infty w^U(u)(\hat{\mathcal{Y}}(u) - l_1\log(u) - l_0)^2\,du, \tag{8.1}$$

where $w^U(u) = U^{-1}w^1(U^{-1}u)$, and $w^1(u) = u\,1_{\{\epsilon<u<1\}}$ for some $\epsilon>0$. An
estimate for the fractional order is defined as $\hat\alpha_U = l_1^U$. It is not difficult to
show that $\hat\alpha_U$ is of the form

$$\hat\alpha_U = \int_0^\infty \tilde w^U(u)\hat{\mathcal{Y}}(u)\,du$$

with $\tilde w^U(u) = U^{-1}\tilde w^1(U^{-1}u)$ and $\tilde w^1(u) = w^1(u)[A_1\log(u)-A_2]$, where $A_1$
and $A_2$ are two positive constants such that $\tilde w^1$ satisfies conditions (6.5).
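The minimization problem (8.1) is a weighted linear regression of the transformed characteristic function on $\log(u)$. The sketch below discretizes the integral on a user-supplied grid; the grid, the default value of the lower weight bound and the function names are assumptions made for illustration.

```python
import numpy as np

def truncate(x, lower=0.01, upper=0.95):
    """Truncation operator T_{omega_-, omega_+}: clip x to [lower, upper]."""
    return np.clip(x, lower, upper)

def fractional_order_wls(u, abs_cf_sq, cutoff, eps=0.1):
    """Weighted least squares fit of log(-log(T[|phi|^2])) on log(u), as in (8.1).

    Uses w^U(u) = U^{-1} w^1(u / U) with w^1(s) = s on (eps, 1), evaluated on
    the grid u (u > 0). Returns the slope l_1, i.e. the order estimate.
    """
    y = np.log(-np.log(truncate(abs_cf_sq)))
    s = u / cutoff
    w = np.where((s > eps) & (s < 1.0), s, 0.0) / cutoff
    design = np.column_stack([np.ones_like(u), np.log(u)])
    sw = np.sqrt(w)
    # Weighted least squares via rescaling rows by the square root of the weights
    coef, *_ = np.linalg.lstsq(design * sw[:, None], y * sw, rcond=None)
    return float(coef[1])
```

When $|\phi(u)|^2 = \exp(-2\eta|u|^\alpha)$ exactly and the truncation is inactive on the support of the weight, the fitted slope equals $\alpha$; with an estimated characteristic function the choice of the cut-off trades bias against variance.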

Let $U_1 > U_2 > \cdots > U_K$ be an exponentially decreasing sequence of
cut-offs and $\hat\alpha_1,\ldots,\hat\alpha_K$ be the corresponding sequence of estimates. Following
(7.1), we construct a sequence of aggregated estimates $\tilde\alpha_1,\ldots,\tilde\alpha_K$ using a
triangle kernel and a set of critical values $V_1,\ldots,V_K$ computed by (7.4). The
variances $\{\sigma_k^2\}$ in (7.3) are estimated from above by means of a bound. Box
plots of $\tilde\alpha = \tilde\alpha_K$ based on 500 trials for different $n$ and different pairs of $\kappa$
and $\delta$ are shown in Figure 1.

Fig. 1. Box plots of the estimate $\tilde\alpha$ under P for different sample sizes and different
parameters of the GH process (panels: kappa=1, delta=0.5; kappa=1, delta=1;
kappa=2, delta=1; kappa=5, delta=1).

8.2. Estimation of the fractional order from options data. In the case
of calibration (estimation under Q) we first compute the prices of $n$ call
options,

$$C(y_k,T) = S\,\mathrm{E}^{Q}[(e^{Y_T}-e^{y_k})^+], \qquad k=1,\ldots,n,$$

using formula (3.3) where the underlying process $Y$ follows a GH Levy model
(parameters will be specified later on): $S=1$, $T=0.25$ and $r=0.06$. The
log-moneyness design $(y_j)$ is chosen to be normally distributed with zero mean
and variance 1/3 and reflects the structure of the option market where many
more contracts are settled at the money than in or out of the money. Finally,
we simulate

$$O_T(y_j) = \mathcal{O}_T(y_j) + \sigma(y_j)\xi_j, \qquad j=1,\ldots,n,$$

where $\xi_j$ are standard normal, $\mathcal{O}_T$ is defined in (3.2) and $\sigma(y) = [\sigma\mathcal{O}_T(y)]^2$.
In the first step of our estimation procedure we find the function $\tilde O$ among
all functions $O$ with two continuous derivatives as the minimizer of the
penalized residual sum of squares

$$\mathrm{RSS}(O,L) = \sum_{i=0}^{n+1}(O_T(y_i)-O(y_i))^2 + L\int_{y_0}^{y_{n+1}}[O''(u)]^2\,du, \tag{8.2}$$

where $y_0 \ll y_1$ and $y_{n+1} \gg y_n$ are two extrapolated points with artificial
values $O_{n+1} = O_0 = 0$. The first term in (8.2) measures closeness to the data,
while the second term penalizes curvature in the function, and $L$ establishes
a trade-off between the two. The two special cases are $L=0$, when $O$
interpolates the data, and $L=\infty$, when a straight line using ordinary least squares
is fitted. In our numerical example we use the R package p-splines with
the choice of $L$ that minimizes the generalized cross-validation criterion. It
can be shown that (8.2) has an explicit, finite dimensional, unique minimizer
which is a natural cubic spline with knots at the values of $y_i$, $i=1,\ldots,n$.
Since the solution of (8.2) is a natural cubic spline, we can write

$$\tilde O(y) = \sum_{j=1}^{n}\theta_j\beta_j(y),$$

where $\beta_j(y)$, $j=1,\ldots,n$, is a set of basis functions representing the family
of natural cubic splines. We estimate $F[\mathcal{O}](u+i)$ by

$$F[\tilde O](u+i) = \sum_{j=1}^{n}\theta_j F[e^{-y}\beta_j(y)](u).$$

Although $F[e^{-y}\beta_j(y)]$ can be computed in closed form, we just use the fast
Fourier transform (FFT) and compute $F[\tilde O](u+i)$ on a fine dyadic grid. On
the same grid one can compute

(8.3) ^(u):=-log(l + i;(t; + i)F[d](t; + i)), uGl, 

where log(-) is taken in such a way that ip(v) is continuous with i) = 
0. Now we can follow the road map of the adaptive spectral calibration 
algorithm and get an estimate for the fractional order of the underlying GH 
Levy model. In Figure 2, box plots of the final estimate $\tilde\alpha = \tilde\alpha_K$ based on 500
Monte Carlo trials are shown in the case of the underlying GH Levy model
with parameters $\beta=-1$, $\lambda=1$ and different $\kappa$, $\delta$. The sample size $n$ is equal to
1000 and the noise level $\sigma$ takes values in the set $\{1,10,20\}$. The estimate $\tilde\alpha$ is
obviously biased because of numerical errors (due to the approximation of
the Fourier integral and linearization).
Fig. 2. Box plots of the estimate $\tilde\alpha$ under Q for different noise levels and different sets
of parameters of the underlying GH Levy process (panels: kappa=2, delta=2;
kappa=5, delta=2; kappa=2, delta=5; kappa=5, delta=5).

8.3. Processes with a nonzero diffusion part. Turn now to the class of
Levy processes containing a nonzero diffusion part which was treated in
Section 6.9. The only algorithmic difference to the case of processes with
zero diffusion part is that now we first fix some $\xi>1$ and compute

$$\hat{\mathcal{Y}}_\xi(u) := \log(-\log(T_{\omega_-,\omega_+}[\hat\rho_\xi](u)))$$

instead of $\hat{\mathcal{Y}}(u)$, where $\hat\rho_\xi(u) = |\hat\phi(u)|^{2\xi^2}/|\hat\phi(\xi u)|^2$ with $\hat\phi$ being an estimate
of $\phi_\sigma$. In the estimation procedure we consider only the set of $u$ with
$|\hat\phi(\xi u)|>0$. Note that this set is smaller than the set where $|\hat\phi(u)|>0$ since
$\xi>1$. It is also intuitively clear that more observations are needed to
estimate $\rho_\xi$ with the same quality as $|\phi(u)|^2$, and therefore the first problem is
likely to be computationally more difficult. This conjecture is supported by
our simulation study as well. Figure 3 shows the box plots of two estimates
$\tilde\alpha$ and $\tilde\alpha_\xi$ based on 500 samples under the historical measure P from the GH
Levy model with zero diffusion part (left) and with the diffusion parameter
$\sigma$ equal to 0.1 (right), the remaining parameters $\lambda$, $\beta$, $\kappa$ and $\delta$ being equal to
1, 0, 1 and 4, respectively. The estimate $\tilde\alpha_\xi$ is constructed from the
estimates $\hat\alpha_{U_1,\xi},\ldots,\hat\alpha_{U_K,\xi}$ [we use $\xi=2$ and $w^1(u) = u\,1_{\{\epsilon<u<1\}}$ in (8.1)] via the
stagewise aggregation procedure as described in Section 7. We took $K=30$,
$U_k = 100(1.25)^{-(k-1)}$, $k=1,\ldots,K$, and $\mathcal{K}(x) = (1-x)1_{\{0<x<1\}}$. As to the
critical values, they are determined via (7.4) with $r=1$, $\gamma=0.5$. Note that
while the difference between $\tilde\alpha_\xi$ and $\tilde\alpha$ is rather pronounced for small sample
sizes, it almost disappears for sample sizes as large as 1000.
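The two ingredients specific to this setting, the exponentially decreasing cut-off grid and the ratio statistic restricted to the set where the denominator is positive, can be sketched as follows. The function names and the convention of returning NaN outside that set are assumptions made for the example.

```python
import numpy as np

def cutoff_grid(u_max=100.0, ratio=1.25, size=30):
    """Cut-offs U_k = u_max * ratio^{-(k-1)}, k = 1, ..., size."""
    return u_max * ratio ** (-np.arange(size))

def rho_hat(cf, u, xi=2.0):
    """rho_xi(u) = |cf(u)|^(2 xi^2) / |cf(xi u)|^2 where |cf(xi u)| > 0.

    cf is any callable returning (estimated) characteristic function values;
    entries with |cf(xi u)| = 0 are returned as NaN.
    """
    u = np.asarray(u, dtype=float)
    num = np.abs(cf(u)) ** (2.0 * xi**2)
    den = np.abs(cf(xi * u)) ** 2
    out = np.full(u.shape, np.nan)
    np.divide(num, den, out=out, where=den > 0)
    return out
```

For a purely Gaussian characteristic function the ratio is identically 1, which reflects the removal of the diffusion component before the order is estimated.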



APPENDIX 

A.1. Proof of Lemma 6.1. For any positive $\omega_-$ and $\omega_+$ satisfying
$\omega_-(u) < |\phi(u)|^2 < \omega_+(u)$, we have

\y(u) - y(u) - Ci(«)(T w [M a ](ii) - l^)l 2 )l 

<^(T^, w+ [|^| 2 ](n)-|0(n)| 2 ) 2 



< 



2 



mu)v-\4>{u)\ 2 ) 2 - 



Furthermore, 

Mu)\ 2 



[\cj>\ 2 }(u)\<Mu)\ 2 -\cf>( U )\% uGR d , 



Fig. 3. Box plots of the estimates $\tilde\alpha$ (left) and $\tilde\alpha_\xi$ (right) under P for different sample
sizes (panels: kappa=1, delta=5, a=0; kappa=1, delta=5, a=0.1).
and it holds on the set |^(u)| 2 ^

U{u)\ 2 - \<P{u)\ 2 \ > min{|0(n)| 2 - - \<p(u)\ 2 }. 

Thus, 



)i 2 - t^^uWu^ < C ^u{u)\ 2 - \m\ 2]2 



on the set \4>(u)\ 2 <£ [(*)—,(*)+], provided that 
2|^)| 2 |log(|0(n)|)|min{|^)| 2 - - \^(u)\ 2 } > 

that is, 



min< 1 



1>> 



)(n)| 4 log 2 (|0(n)| 2 ) 
l + |log(|</>(«)| 2 )| ' 

log(|<^)| 2 ) 



l + |log(|</)(n)| 2 )|- 



A. 2. Proof of Proposition 6.3. Without loss of generality we can assume 
that p = in (6.8). Denote 



p{x) 



smx . 
1 \v[x)\ 



then p is, up to a scaling factor, the density of some probability distribution 
with the characteristic function ( p ip(u) where Q p is a positive constant and 



(ip{u) — tp(u + w)) dw. 



Due to (6.11) the following asymptotic expansion holds: 

T{U + W) 



i/j(u) = \u\ a r(u) 



1 



C ± (a,K)\u\ a - 2 [l + 0{\u\- K )}, 



w 

1 + - 

u 



t(u) 



dw 



u — > ±oo, 



with some constants $C_+$ and $C_-$ depending on $\alpha$ and $\varkappa$. We consider
separately two cases.

Case $0<\alpha<1$. Note that in this case $\tilde\psi(u)$ is integrable on $\mathbb{R}$ and the
Fourier inversion formula implies

C [°° ~ 
p(x) = — (exp(— ixu) — l)ip(u) du 

2?r J-oo 
since p(0) = 0. We have for any positive number a,

(exp(— ixu) — l)ijj(u)du 



(exp(— ixu) — \)ip{u) du + (exp(— ixu) — \)ip(u) du 

|u|<a J\u\>a 

where |ii| < \x\ < |a;| 1 ~ a+K for x — > provided that k < a. Furthermore, 

I l—a+K\ 



I 2 = C±(a,K) I (exp(-irra) - l)\u\ a ~ 2 du + 0(\x\ 

J \u\ >a 

= C±(a,K)\x\ 1 ~ a [l + 0(\x\ K )}, x^±0, 
and (6.12) holds. 

Case 1 < a < 2. In this case we use the Fourier inversion formula for 
distribution functions to get 

f p( x )dx = ^ f^Re[^)]A, 
J\x\<e 71 JO u 

The representation 

sin(en) 



o 



ii 



Ke[ip(u)] du 



[ a sin(eu) ~ f°° sin(eu) ~ 

= / — - — L Re[ib(u) ]du+ / — - — J -Rehp(u)}du 

J0 U Ja U 

= :h+h 
and the asymptotic relation 

h = C + (a, K ) r S -^l u a - 2 du + 0(e 2 - a+K ) 

Ja U 

= C + (a,K)e 2 - a [l + 0{e K )], 
lead now to (6.12) provided that k < a — 1. 

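A.3. Proof of Theorem 6.4. The representation

$$\hat\alpha_U-\alpha = \int_0^\infty w^U(u)(\hat{\mathcal{Y}}(u)-\mathcal{Y}(u))\,du + R_U$$

and Lemma 6.1 imply that

$$\mathrm{E}|\hat\alpha_U-\alpha|^2 \le 3\mathrm{E}\left[\int_0^\infty w^U(u)\zeta_1(u)\Delta(u)\,du\right]^2 + 3\mathrm{E}\left[\int_0^\infty w^U(u)\zeta_2(u)\Delta^2(u)\,du\right]^2 + 3|R_U|^2. \tag{A.1}$$

Let us consider the first term in (A.1):

$$\mathrm{E}\left[\int_0^\infty w^U(u)\zeta_1(u)\Delta(u)\,du\right]^2 = \left[\int_0^\infty w^U(u)\zeta_1(u)\mathrm{E}[\Delta(u)]\,du\right]^2 + \mathrm{Var}\left[\int_0^\infty w^U(u)\zeta_1(u)\Delta(u)\,du\right].$$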

Since 



CiW = 2- 1 |^)r 2 log- 1 (|^)|) = e 2 ^l QRc ^/(2r ? |nrRer(n)), 
we have 

/•oo 

/ w u {u)Ci(u)E[A(u)]du 
Jo 



(A.2) 



w 1 {u)( 1 {Uu)E[A(Uu)]du 

rl w l(^ e 2r l U a u a KeT{Uu) 



E[A(Uu)} du. 



2r]u a ReT(Uu) 

Due to the localization principle (Laplace method) and the identity

$$\mathrm{E}[\Delta(u)] = \mathrm{E}|\hat\phi(u)|^2 - |\phi(u)|^2 = \varepsilon(1-|\phi(u)|^2),$$

the integral in (A.2) is asymptotically (as $U\to\infty$) less than or equal to

AeU~ a f 1 w\u)u- a e 2l i uaua du<eU- a e 2r > ua 

Jl-8 

with arbitrary small 5 > and some constant A > 0. Similarly, 



Var 



w u (u)d(u)A(u) du 




w u (u)w u (v)Ci{u)Ci{v) Cov(A(u), A(v)) dudv 



J 
where again localization principle and the identity,
Cov(|?(u)| a ,|£(t;)| a ) 

= 2e 3 {e~ 1 -lXe" 1 -2) 
x [Re(4>(u)4>(v)4>(—u — v)) 

+ BbM-uMvMu - v)) - 2|^)| 2 |^)| 2 ] 
+ ^(e" 1 - l)Mu + v)\ 2 + \4>{-u + v)\ 2 - 2|0(«)| 2 |0( U )| 2 ] ! 
are used. Turn now to the second term in (A.l); 



E 



Since 



and 



w u {u)Q 2 {u)A 2 (u)du 

) 

w u {u)( 2 {u)E[A 2 (u)]du 

|log|0W|| 



+ Var 



w u {u)Q 2 {u)A 2 {u)du 



C2H 



< 



u — f 00, 



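$$\mathrm{E}\big||\hat\phi(u)|^2-|\phi(u)|^2\big|^2 = \mathrm{E}\big||\hat\phi(u)|^2-\mathrm{E}|\hat\phi(u)|^2+\mathrm{E}|\hat\phi(u)|^2-|\phi(u)|^2\big|^2$$

$$\le 2\mathrm{E}\big||\hat\phi(u)|^2-\mathrm{E}|\hat\phi(u)|^2\big|^2 + 2\big|\mathrm{E}|\hat\phi(u)|^2-|\phi(u)|^2\big|^2 \lesssim \varepsilon|\phi(u)|^2+\varepsilon^2, \qquad u\to\infty,$$

we get an asymptotic estimate: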

/•oo 

/ w u (u)( 2 (u)E[A 2 (u)} du < e\J a e 2rilja + e 2 U a e^ ua , U^oo. 
Jo 

Similarly, one can prove that 



Var 



w u (u)( 2 (u)A 2 (u) du 



<e 2 U 2a e Ar > u \ U^oo. 



Finally, the third term in (A.1),

$$R_U = \int_0^\infty w^U(u)\log(\mathrm{Re}\,\tau(u))\,du,$$

can be bounded by
-1 



\Ru\ 



w (u) log(Rer(Mf/)) du 







<U- 1 \w l (y/U)\\\og(RzT{y))\dy 
+ U-* \y\- H \w 1 (y)\dy<U-' t , U^oo,
Jo 

for A > large enough. Combining all the previous estimates we get 
ElSj, - a\ 2 < eU- 2a e 2r > ua + e 2 U 2a e 4 ^ a + 

(A-3) 

< eU- m e 2 ^ ua + e 2 U m e^ ua + U~ 2 ", 

Finally the choice 



U — > oo. 



2r/+ 



logfc^log-^l/e)) 



with /? = 1 + >e/a leads to (6.13). 

In the case of the calibration problem we have 



(u)r = l-2Re 



+ u 2 (l + « 2 )^e iu ^-%)^^(5( % )0( 2 / / 

3,1=1 



and 



EU(u)r = l-2Re 



n 



+ u 2 (l + u 2 )Y,e iu{yi ^ ) d ] 5 l O(y j )0(y l ) 

3+1 
n 

+ u 2 (l + u 2 )^5 2 a 2 . 

3=1 

As was mentioned in Section 3.2, function O(y) = e~ v O(y) is nonnegative, 
Lipschitz and satisfies the Cramer condition, 

/ 0{y)e~ v dy < oo, 
provided that E[e 2 ^ T ] < oo. Under the condition e~ A < ||<5|| 2 we get 



/ e iu yd(y)dy-y2e iu ^5 j d(y j 

3 = 1 



<¥\\\ 
as well as



e luy O{y)dy 



2 n 



J2 e^'-^S^OiyjMyt) 

3,1=1 



< 

|"NJ 



0. 



Thus, 



|E|^)| 2 -|0(n)| 2 |< U 2 (l + U 2 )^5|(l + ^). 



Further, 



(u)\ 2 -E\(P(u)\ 2 = -2Re 



i(u + i) ^j a j^j el 



3=1 



+ 2u 2 (l + u 2 ) ^ j u to-*A8 j 8 l a j a£ j b 



3<l 
n 



+ u 2 (l + u 2 )£ ( 5 2 5 2 (£ 2 -l) 

3=1 



and 



E(|«^)| 2 - E|0(n)| 2 ) 2 < u 2 (l + u 2 ) £ <5 2 a 2 + u 4 (l + u 2 ) 2 £ 

3=1 3=1 

Using these inequalities, the first term in (A.l) can be estimated from above 



as 



E 



' f(X> 

/ w u (u)(i(u)A(u)du 
Jo 



<u 8-2a e 4 V U« m 4 + u 4-2a e 4r 1 U° 



Z^3 ?5 3 
■3=1 



<e 2 U 8 - m e^+ ua , 

while the second one is asymptotically negligible if e 2 C/ 8_2a e 4r?f/a — > 0. Tak- 
ing 



U ■ 



2rj + 



log(e- 1 log-^(l/ e )) 



l/a 



with j3 = {x + A)/a- 1, we get (6.13). 

A.4. Proof of Theorem 6.5. For any two probability measures P and Q
define

$$\chi^2(P,Q) := \begin{cases}\displaystyle\int\left(\frac{dP}{dQ}-1\right)^2 dQ, & \text{if } P\ll Q,\\ +\infty, & \text{otherwise.}\end{cases}$$

The following proposition is the main tool for the proof of lower bounds in
the estimation case and can be found in Butucea and Tsybakov (2004).

Proposition A.l. Let V® := {Pg:0 e 6} be a family of models. As- 
sume that there exist 9\ and 62 in Q with \6\ — #2) > 25 n > such that 

P dl <.Pe 2 , X \P^ n ,P^ n )<^<h 

then 

liminfinf5- 2 max{E ei |a„ - 0i| 2 ,E fl J0 n - 9 2 \ 2 } > (1 - k) 2 (1 - , 

where the infimum is taken over all estimators 9 n (measurable function of 
observations) of the underlying parameter. 

Taking = A(a, ??+, x) and 0$ = (ai,r]i,Ti),i = 1, 2, we get from Propo- 
sition A.l, 

sup E(|a e — a| 2 ) > 5~ 2 max{Ei(|a e — ai| 2 ),E2(|a e — c^l 2 )}, 

{a,rj,T)£A(a,ri- ,rj+, x) 

provided that \a\ — c^l > 2<5 n > 0, and 

X 2(P®»,fg»)<«2<l. 

Turn now to the construction of models 0\ and 62- Let us consider a sym- 
metric stable model, 

ip(u) =i//u + i?(tt), = -r/ + |u| a , 0<a<l,uel. 

For any 5 satisfying < 5 < a and M > 0, define 

V\s(«) = i/uu + 1?<5 (u) ; 

where 

d s {u) = -ri+\u\ a l{\u\<M} ~ ri + cM-*) ^"'^ 1 + ^^{W^y 

Then (j)$(u) = exp(i/xu + $s( u )) is a characteristic function of some Levy 
process and 

Mu) = <P{u), \u\<M, 

where $\phi(u) = \exp(i\mu u + \vartheta(u))$. Indeed, the function $\vartheta_\delta(u)$ is a continuous,
nonpositive, symmetric function which is convex on $\mathbb{R}_+$ for large enough
$M$ and small enough $c>0$. According to the well-known Polya criterion [see,
e.g., Ushakov (1999)], the function $\exp(\xi\vartheta_\delta(u))$ is a c.f. of some absolutely
continuous distribution for any $\xi>0$. In particular, for any natural $n$ the
function $\exp(\vartheta_\delta(u)/n)$ is a c.f. of some absolutely continuous distribution.
Hence, $\exp(\vartheta_\delta(u))$ is a c.f. of some infinitely divisible distribution. Define

(A.4) 0i = (a,77+,l), 8 2 = (a- 5,i] + ,t S) m) 

and 4>q 1 (u) = 4>(u),4>g 2 (u) = <f>s(u) with 

T8,m( U ) := \ U \ 1 {\u\<M] + (i+cM-x^ 1 + C ' U I ") 1 {|«I>A/}- 

If M 5 = 1 + cM~ K , that is, 

(A.5) 5 = log(l + cM~ K )/ log MxcAf-"/ log M, M^oo, 
then 

|t,5,m(«) - 1| ^ \u\~", |u|-^oo, 
and hence 62 G © = -4(a, 7?_, 77+, x). Furthermore, it holds 

^2,p®n p®nN _ v 2/_ _ x _ _ f My) ~ P9 2 (y)\ 2 . . 
X 1^0, >-^9 J- n X \P9i,P9 2 )- n / 7 \ "ZA 

where pgj and pe 2 are densities corresponding to cf. <f>e 1 and 4>g 2 , respectively. 
Using the fact that the density of stable law pg x (y) does not vanish on any 
compact set in M. and fulfills 

voAv) Z \y\~ {a+l \ \y\^oo, 

we derive 



nx (pOnPOs) <nCi / \poAv) - Po 2 (y)\ dy 
J\v\<a 

+ nC 2 [ \y\ a+l \peAy)-PeM\ 2 dy 

J\y\>A 



nCih + nC 2 l 



2 



for large enough A > and some constants 61,62 > 0. Using the fact that 
function 4'$ l (u) — <t>e 2 { u ) is two times differentiable (it is zero for \u\ < M) 
and Parseval's identity, we get 

h < — I \4> 9l {u) - (f)g 2 (u)\ 2 du 

< — f e- 2r >^ a - S du < M 1 ~ a+S e~ 2vMa ~ 6 , 
2ti" J\u\>m 



1 



2vr 



h<7r I |(<ta(«) - <Ae 2 («))"| 2 ^ 



|u|>M 



< / |«|6 e -2»j|«l a -* £iu <M 7 - a+ *e-^ 

J|«|>M 
The choice M x [±- \og{e~ l log~ /3 (l/e))] 1 /( a ~ <5 ) with e = 1/n and some /3 >
(7- (a - 5)) /2(a -5) yields 

e _1 X 2 (pei,Pe 2 ) < 1 

for small enough e. Combining this and (A. 5), we arrive at (6.14). 

For the proof of lower bounds in the case of calibration, one can employ 
the fact that the regression model, 

Orivi) = T {vi) + o{yi)Z,u k=Vi~ Vi-i, 

E[£?] = l, i = l,...,n, 

is equivalent to the Gaussian white noise model, 

dZ(x) = 6{y) dy + e l/2 dW{y) 

with the noise level asymptotics e — > 0, a two-sided Brownian motion W. 
Here the noise level e corresponds to the statistical regression error Y^=i ^j^j 
Furthermore, instead of x 2 distance we use the Kullback-Leibler divergence, 

KL(T dl ,T d2 ) = l [ \(Og 1 -de 2 )(y)\ 2 e- 1 dy, 

between two models Te 1 and % 2 corresponding to two Levy processes with 
characteristics 6\ and 62, respectively [see (A. 4)]. Simple calculations lead 
to the estimate, 

KL(T ei ,Te 2 )<e^M^-^ Ma - S 
with some 7 > 0. Hence, for small enough e > it holds 

KL(T ei ,Te 2 )<l 

provided that M x [^- log(e _1 log _/3 (l/e))] 1 /( a ~' 5 ) with j3 > <y/2(a-5). The 
Assouad lemma [see, e.g., Tsybakov (2008)] together with (A. 5) implies 
(6.14). 

A.5. Proof of Proposition 6.6. It holds for any fixed $U$,

$$\hat\alpha_U-\alpha = \int_0^\infty w^U(u)(\hat{\mathcal{Y}}(u)-\mathcal{Y}(u))\,du + R_U = \int_0^\infty w^U(u)\zeta_1(u)\Delta(u)\,du + \int_0^\infty w^U(u)Q(u)\,du + R_U,$$

where $Q$ is defined in (6.3). As shown in Lemma A.2, the process $\varepsilon^{-1/2}\Delta(u)$
converges weakly to a Gaussian process $Z(u)$ with $\mathrm{E}[Z(u)] = 0$ and
$\mathrm{Cov}(Z(u),Z(v)) = S(u,v)$. Moreover, $\varepsilon^{-1/2}Q(u)\to 0$ almost surely. The representation
for $\Delta(u)$ in Lemma A.2 and the CLT for U-statistics imply that if for some
sequence $U(\varepsilon)$ it holds $\varsigma^{-1}(\varepsilon)R_{U(\varepsilon)}\to 0$ with

$$\varsigma^2(\varepsilon) = \varepsilon\int_0^\infty\!\!\int_0^\infty w^{U(\varepsilon)}(u)w^{U(\varepsilon)}(v)\zeta_1(u)\zeta_1(v)S(u,v)\,du\,dv,$$

then $\varsigma^{-1}(\varepsilon)(\hat\alpha_{U(\varepsilon)}-\alpha)\to\mathcal{N}(0,1)$.

A. 6. Proof of Theorem 6.7. We give only the sketch of the proof. Let 
and u + be two truncation levels satisfying < < p^(u) < w+(u) < 1 

and < uj- < p^(u)(l — log(p^(u))/(l + log(/3g («)))). First, similar to the 
proof of Proposition 6.1, one can show that 

\y^(u)-y(u)- Ci,$ (u){T 0jU+ [p^l (u) -p^(u))\< C2,c ( u ) ( T o,u+ [Pt,\( u ) ~ Pt( u )) 2 1 
where 

Cu(«) = -pf iog -1 (pe(«)) 

and 

• l + |log(g)| 
. # 2 log 2 (#) 

Furthermore, we have on the set {p^{u) <u>+(u)} 

|p € («) -r 0jW+ [^](u)| 

||<^n)| 2 -|^)| 2 | ||0 a («)| 2 ^ - \4>{u)\^ 

^ ^+1") 1 7 77 mo + 



(,2{u) = max 

0€{aj_(u),a>+(u)} 



and on the set {p^(u) > ui + (u)} it holds 

\pe(u) - T 0j u + [p ( ](u)\ < 2w+(«). 

Hence 

E|p € («) - T 0iW+ [p 5 ](u)| 2 

<2|0 a (^)|- 4 [E||0 a (^)| 2 -|0(^)| 2 | 2 

+ E||^ a ( U )| 2 « 2 - |0(n)| 2 « 2 | 2 ] + 4 W 2 (u)P(&(u) > u+(u)). 

Without loss of generality one can assume that there exists Uq > such that 
P^(u)/uj + (u) < 1/2 for u > Uq. Then it holds for u> Uq 

< P(||0 a (n)| 2 « 2 - \Hu)\ 2 ^\ > W+ («)|0«)| 2 /4) 
+ P(||0a«)| 2 " \H<)\ 2 \ > cu + (n)|^«)| 2 /4) 

< 16|0 a (^)|- 4 [E||^ a (^)| 2 - |^n)| 2 | 2 + E||0 a (n)| 2 « 2 - \4>{u)\^\ 2 }. 




In the case of the estimation under P, for instance, we have 

mMHu)\ 2 -\Hau)\ 2 \ 2 <e, E\\Mu)\ 2e -\Hu)\ 2e \ 2 <e, e^O, 
and hence 

E\p ( (u) -T , w+ [^](n)| 2 <e\MCu)\~\ e^O. 
Now one can follow the proof of Theorem 6.4 and use the fact that 
Ci,^(u) X c^ 1 (a)|ii| _a T^ 1 («)exp(c^(o;)|ti| a T|(u)), u—> oo. 

A.7. Proof of Theorem 6.8. Instead of Levy models $\theta_1$ and $\theta_2$, one
considers models $\theta_{1,\sigma}$ and $\theta_{2,\sigma}$ with characteristic exponents
$\psi_\sigma(u) = i\mu u - \sigma^2u^2/2 + \vartheta(u)$ and $\psi_{\sigma,\delta}(u) = i\mu u - \sigma^2u^2/2 + \vartheta_\delta(u)$, respectively. The rest
of the proof is almost identical to the proof of Theorem 6.5.

A.8. Auxiliary results. The following lemma is a basic tool to investigate
the asymptotic behavior of the estimate $\hat\alpha$ under the historical measure P.

Lemma A.2. The process $\varepsilon^{-1/2}\Delta(u)$ with $\Delta(u) = |\hat\phi(u)|^2-|\phi(u)|^2$ weakly
converges to a Gaussian process $Z(u)$ with $\mathrm{E}[Z(u)] = 0$ and $\mathrm{Cov}(Z(u),Z(v)) = S(u,v)$, where

$$S(u,v) := \mathrm{Re}\,\phi(u-v) + \mathrm{Im}\,\phi(u+v) - (\mathrm{Re}\,\phi(u)+\mathrm{Im}\,\phi(u))(\mathrm{Re}\,\phi(v)+\mathrm{Im}\,\phi(v)).$$

Proof. We have

$$|\hat\phi(u)|^2 = [\mathrm{Re}\,\hat\phi(u)]^2 + [\mathrm{Im}\,\hat\phi(u)]^2 = \frac{1}{n^2}\sum_{j=1}^{n}\sum_{k=1}^{n}\cos(u(X_j-X_k)).$$

Put

$$H_n(u) = \binom{n}{2}^{-1}\sum_{c}\cos(u(X_i-X_k)) = \frac{2}{n(n-1)}\sum_{c}\cos(u(X_i-X_k)),$$

where the summation $\sum_c$ is over all $\binom{n}{2}$ combinations of 2 integers chosen from
$(1,\ldots,n)$. Then, with $\varepsilon = 1/n$,

$$\varepsilon^{-1/2}(|\hat\phi(u)|^2-|\phi(u)|^2) = \varepsilon^{1/2} + \varepsilon^{-1/2}(1-\varepsilon)(H_n-|\phi(u)|^2) - \varepsilon^{1/2}|\phi(u)|^2.$$

The first and third terms on the right-hand side converge to 0. Consider the
middle term. Since $H_n(u)$ is a U-statistic (for each $u$), $\varepsilon^{-1/2}(H_n-|\phi(u)|^2)$
weakly converges to a Gaussian process with zero mean and covariance

$$\mathrm{Cov}[\mathrm{E}_{X_2}\cos(u(X_1-X_2)),\ \mathrm{E}_{X_2}\cos(v(X_1-X_2))]$$
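The algebraic identity behind this decomposition, $|\hat\phi(u)|^2 = \varepsilon + (1-\varepsilon)H_n(u)$ with $\varepsilon = 1/n$, can be verified numerically. The sketch below uses simulated standard normal data purely as a stand-in for the increments; the function names and the seed are illustrative assumptions.

```python
import numpy as np

def abs_ecf_sq(x, u):
    """|phi_hat(u)|^2 computed directly from the empirical characteristic function."""
    return float(np.abs(np.exp(1j * u * x).mean()) ** 2)

def h_n(x, u):
    """U-statistic H_n(u): average of cos(u (X_i - X_k)) over all pairs i < k."""
    i, k = np.triu_indices(len(x), k=1)
    return float(np.cos(u * (x[i] - x[k])).mean())

rng = np.random.default_rng(0)
x = rng.standard_normal(50)          # stand-in for the observed increments
u, eps = 1.3, 1.0 / 50

lhs = abs_ecf_sq(x, u)
rhs = eps + (1.0 - eps) * h_n(x, u)  # diagonal terms contribute exactly 1/n
```

The $n$ diagonal terms of the double sum each equal 1, which produces the $\varepsilon$ term, while the off-diagonal terms form the U-statistic.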
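(where $\mathrm{E}_X Y$ denotes the conditional expectation of $Y$ given $X$). Let us
compute this covariance. For any $u,v\in\mathbb{R}$ it holds

$$\mathrm{Cov}(\mathrm{E}_{X_2}[\cos(u(X_1-X_2))],\ \mathrm{E}_{X_2}[\cos(v(X_1-X_2))])$$

$$= \mathrm{E}[(\cos(uX_2)-\mathrm{Re}\,\phi(u))\,\mathrm{Re}\,\phi(u) + (\sin(uX_2)-\mathrm{Im}\,\phi(u))\,\mathrm{Im}\,\phi(u)]$$

$$\times[(\cos(vX_2)-\mathrm{Re}\,\phi(v))\,\mathrm{Re}\,\phi(v) + (\sin(vX_2)-\mathrm{Im}\,\phi(v))\,\mathrm{Im}\,\phi(v)],$$

where

$$\mathrm{E}(\cos(uX_2)-\mathrm{Re}\,\phi(u))(\cos(vX_2)-\mathrm{Re}\,\phi(v)) = \frac{\mathrm{Re}\,\phi(u+v)+\mathrm{Re}\,\phi(u-v)}{2} - \mathrm{Re}\,\phi(u)\,\mathrm{Re}\,\phi(v),$$

$$\mathrm{E}(\sin(uX_2)-\mathrm{Im}\,\phi(u))(\sin(vX_2)-\mathrm{Im}\,\phi(v)) = \frac{\mathrm{Re}\,\phi(u-v)-\mathrm{Re}\,\phi(u+v)}{2} - \mathrm{Im}\,\phi(u)\,\mathrm{Im}\,\phi(v)$$

and

$$\mathrm{E}(\cos(uX_2)-\mathrm{Re}\,\phi(u))(\sin(vX_2)-\mathrm{Im}\,\phi(v)) = \frac{\mathrm{Im}\,\phi(v-u)+\mathrm{Im}\,\phi(u+v)}{2} - \mathrm{Re}\,\phi(u)\,\mathrm{Im}\,\phi(v). \qquad \square$$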